[Chicago] is there really no built-in file/iter split() thing?
Kumar McMillan
kumar.mcmillan at gmail.com
Sun Dec 2 07:23:30 CET 2007
On Dec 1, 2007 11:50 PM, Massimo Di Pierro <mdipierro at cs.depaul.edu> wrote:
> Try this
>
> import re, mmap
> file=open(filename,'r')
> mfile=mmap.mmap(file.fileno(),0,prot=mmap.PROT_READ)
> items=re.compile('[^;]+').finditer(mfile)
> for item in items: print item.group()
nice! I didn't know about mmap.
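
For anyone trying this on Python 3, here is a rough sketch of the same idea, not Massimo's exact code. The names iter_statements, path and sep are made up for illustration. It assumes a single-byte separator and a non-empty file (mmap refuses to map an empty one), and it uses access=mmap.ACCESS_READ instead of prot= because the prot argument is Unix-only. Unlike str.split(), the '[^;]+' pattern skips empty chunks, and the matches come back as bytes.

```python
import mmap
import re


def iter_statements(path, sep=b';'):
    """Yield each non-empty run of bytes between separators in the file.

    Hypothetical helper: `sep` is assumed to be a single byte.
    """
    # Same pattern as '[^;]+' above, built from the separator byte.
    pattern = re.compile(b'[^' + re.escape(sep) + b']+')
    with open(path, 'rb') as f:
        # Map the whole file read-only; the OS pages it in on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mfile:
            for match in pattern.finditer(mfile):
                # group() copies the matched bytes out of the mapping.
                yield match.group()
```

Call .decode() on each chunk if you need text rather than bytes.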
>
> Massimo
>
> On Nov 30, 2007, at 3:49 PM, Kumar McMillan wrote:
>
>
> > [In the hope that Chris has another awesome response...]
> >
> > Here is another: I have a big sql file (45M) and need to iter through
> > the statements---no fancy sql parsing, I just want the statements.
> > Assuming open('big.sql').read().split(';') would be a dumb idea, I
> > couldn't find anything in stdlib to do this. What am I missing? I
> > thought the tokenize module would but I couldn't see how at first
> > glance.
> >
> > def readsplit(filelike, token):
> >     """yields each chunk between tokens in contents of filelike
> >     object.
> >
> >     For example::
> >
> >         >>> [c for c in readsplit(StringIO('''bad; ass; elf in
> >         ... the forest;'''), ';')]
> >         ...
> >         ['bad', ' ass', ' elf in\\nthe forest', '']
> >         >>> [c for c in readsplit(StringIO(''';
> >         ... 1,2,3;
> >         ...  and 4; and
> >         ... even 5'''), ';')]
> >         ...
> >         ['', '\\n1,2,3', '\\n and 4', ' and\\neven 5']
> >         >>>
> >
> >     """
> >     buf = []
> >     for line in filelike:
> >         buf.append(line)
> >         line = ''.join(buf)
> >         buf[:] = []
> >         chunks = line.split(token)
> >         for chunk in chunks[:-1]:
> >             yield chunk
> >         buf.append(chunks[-1])
> >     if len(buf):
> >         yield ''.join(buf)
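
One thing to watch with the line-based version: a dump with very long lines, or no newlines at all, still gets pulled into memory a line at a time. Below is a sketch of a variant that reads fixed-size blocks instead. The name readsplit_blocks and the blocksize parameter are invented here, and it assumes a file-like object opened in text mode (so read() returns '' at end of file). The output should match the generator above, including the trailing empty chunk.

```python
def readsplit_blocks(filelike, token, blocksize=64 * 1024):
    """Yield each chunk between tokens, reading `blocksize` characters at a time.

    Hypothetical variant of readsplit(); assumes a text-mode file-like object.
    """
    tail = None  # unfinished chunk carried over from the previous block
    # iter(callable, sentinel) keeps calling read() until it returns ''.
    for block in iter(lambda: filelike.read(blocksize), ''):
        chunks = ((tail or '') + block).split(token)
        # Everything but the last piece is a complete chunk.
        for chunk in chunks[:-1]:
            yield chunk
        # The last piece may continue in the next block, so hold it back.
        tail = chunks[-1]
    if tail is not None:
        yield tail
```

Because the held-back tail is joined to the next block before splitting, a multi-character token that straddles a block boundary is still found.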
>
>
> _______________________________________________
> Chicago mailing list
> Chicago at python.org
> http://mail.python.org/mailman/listinfo/chicago
>