isplit

Thu Jan 26 07:48:57 EST 2006

I have a file of lines that contain some extraneous chars. This is the
basic version of the code to process it:

IDtable = "".join(map(chr, xrange(256)))
for raw_line in file(file_name):
  line = raw_line.translate(IDtable, toRemove)
  ...

A faster alternative:

IDtable = "".join(map(chr, xrange(256)))
text = file(file_name).read().translate(IDtable, toRemove)
for line in text.split("\n"):
  ...

But text.split builds the whole list of lines at once, which takes a
lot of memory if the text isn't small.
Probably there are simpler solutions (solutions with the language as it
is now), but one possibility seems to be the following, a new method:

str.isplit()
or
str.itersplit()
or
str.xsplit()
Like split, but iterative.
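
A rough sketch of what such a method could do, written as a plain
generator function. The name isplit and its signature are only
assumptions for illustration (no such str method exists), and it covers
just the explicit-separator case, not split() with no arguments or
maxsplit:

```python
def isplit(text, sep="\n"):
    """Yield the pieces of text between occurrences of sep, lazily."""
    if not sep:
        raise ValueError("empty separator")
    start = 0
    while True:
        # Look for the next separator, starting after the previous one.
        end = text.find(sep, start)
        if end == -1:
            # No separator left: the rest of the text is the last piece.
            yield text[start:]
            return
        yield text[start:end]
        start = end + len(sep)
```

Used as "for line in isplit(text):", it gives the same pieces as
text.split("\n") but holds only one piece at a time, in addition to
text itself.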

(Or even making str.split() itself an iterator (for Py3.0), and
str.listsplit() to generate lists.)
(At the moment a simple RE can probably work as the isplit.)
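
One way the RE idea might look, as a sketch only: re.finditer finds the
separators lazily and the text between them is yielded. The function
name isplit_re and the default pattern are assumptions made for this
example, and the pattern is assumed never to match an empty string:

```python
import re

def isplit_re(text, sep_pattern=r"\n"):
    """Yield the pieces of text between matches of sep_pattern, lazily."""
    start = 0
    # finditer returns an iterator, so no list of pieces is ever built.
    for match in re.finditer(sep_pattern, text):
        yield text[start:match.start()]
        start = match.end()
    # Whatever follows the last separator is the final piece.
    yield text[start:]
```

With the default pattern, "for line in isplit_re(text):" walks the
lines like text.split("\n") does, one at a time.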

Bye,
bearophile