Record separator for readlines()
Bengt Richter
bokr at oz.net
Sat Sep 3 00:31:14 EDT 2005
On Fri, 2 Sep 2005 22:10:18 -0500, jepler at unpythonic.net wrote:
>
>I think you still have to roll your own.
>
>Here's a start:
> def ireadlines(f, s='\n', bs=4096):
>     if not s: raise ValueError, "separator must not be empty"
>     r = []
>     while 1:
>         b = f.read(bs)
>         if not b: break
>         ofs = 0
>         while 1:
>             next = b.find(s, ofs)
>             if next == -1: break
>             next += len(s)
>             yield ''.join(r) + b[ofs:next]
>             del r[:]
>             ofs = next
>         r.append(b[ofs:])
>     yield ''.join(r)
>
What if len(s) > 1 and f.read(bs) returns a block that ends partway through s?
I posted a file splitter some time back which, UIGoofed, handles that
(still not tested beyond the examples shown, so caveat utor(??) ;-)
http://groups.google.com/group/comp.lang.python/msg/e333f8b2e2fcdc49
I thought I might be missing something, but a separator split across two reads does get missed:
>>> def ireadlines(f, s='\n', bs=4096):
...     if not s: raise ValueError, "separator must not be empty"
...     r = []
...     while 1:
...         b = f.read(bs)
...         if not b: break
...         ofs = 0
...         while 1:
...             next = b.find(s, ofs)
...             if next == -1: break
...             next += len(s)
...             yield ''.join(r) + b[ofs:next]
...             del r[:]
...             ofs = next
...         r.append(b[ofs:])
...     yield ''.join(r)
...
>>> from StringIO import StringIO as SIO
>>> f = SIO('123xx678xxCxx_and so forth')
>>> for s in ireadlines(f,'xx',4): print repr(s),
...
'123xx678xx' 'Cxx_and so forth'
>>> for s in ireadlines(f,'xx',5): print repr(s),
...
''
oops, forgot to rewind f first
>>> f.seek(0)
>>> for s in ireadlines(f,'xx',5): print repr(s),
...
'123xx' '678xx' 'Cxx' '_and so forth'
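For what it's worth, here is a minimal sketch of one way to make the generator above tolerate a separator that straddles two reads. It is not the splitter from the linked post. It is written for Python 3, the names iter_records, sep, bufsize and pending are made up for illustration, and unlike the original it does not yield a trailing empty string when the input is empty or ends with the separator. The idea is to keep the unyielded tail and, after each read, resume the search len(sep) - 1 characters before the old end of that tail:

```python
from io import StringIO

def iter_records(f, sep='\n', bufsize=4096):
    """Yield records from f, each ending with sep (except possibly the last)."""
    if not sep:
        raise ValueError("separator must not be empty")
    pending = ''  # text read so far that has not been yielded yet
    while True:
        chunk = f.read(bufsize)
        if not chunk:
            break
        # pending holds no complete separator, so a new match must use at
        # least one character of chunk; back up len(sep) - 1 characters.
        start = max(len(pending) - len(sep) + 1, 0)
        pending += chunk
        ofs = 0
        while True:
            end = pending.find(sep, start)
            if end == -1:
                break
            end += len(sep)
            yield pending[ofs:end]
            ofs = start = end
        pending = pending[ofs:]  # keep only the unfinished tail
    if pending:
        yield pending

f = StringIO('123xx678xxCxx_and so forth')
print(list(iter_records(f, 'xx', 4)))
# ['123xx', '678xx', 'Cxx', '_and so forth']
```

With this change the block size of 4 gives the same records as the block size of 5 did above. Concatenating onto pending is simple but can get slow for very long records, so treat it as a starting point only.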
Regards,
Bengt Richter