Canonical way of dealing with null-separated lines?

Douglas Alan nessus at mit.edu
Tue Mar 1 12:23:43 EST 2005


"John Machin" <sjmachin at lexicon.net> writes:

>>        lines = (partialLine + charsJustRead).split(newline)

> The above line is prepending a short string to what will typically be a
> whole buffer full. There's gotta be a better way to do it.

If there is, I'm all ears.  In a previous post I provided code that
doesn't concatenate any strings together until the last possible
moment (i.e. when yielding a value).  The problem with that code
was that it was complicated and didn't work right in all cases.
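
For reference, here is a minimal sketch of the concatenate-then-split
approach under discussion.  It is not the code from the earlier post:
it is rebuilt around the two quoted fragments, and the function name,
the chunk_size parameter, and the snake_case variable names are
assumptions made for illustration.

```python
import io


def read_lines(stream, newline="\0", leave_newline=False, chunk_size=4096):
    """Yield separator-delimited lines from a text stream, chunk by chunk."""
    output_line_end = newline if leave_newline else ""
    partial_line = ""
    while True:
        chars_just_read = stream.read(chunk_size)
        if not chars_just_read:
            break
        # Prepend the leftover from the previous chunk, then split.
        lines = (partial_line + chars_just_read).split(newline)
        # The last piece may be an incomplete line; hold it back.
        partial_line = lines.pop()
        for line in lines:
            yield line + output_line_end
    # Whatever remains at end of input is a final, unterminated line.
    if partial_line:
        yield partial_line


if __name__ == "__main__":
    data = io.StringIO("a\0bb\0ccc")
    print(list(read_lines(data, chunk_size=2)))
```

Because the leftover is joined to the new chunk before splitting, a
separator that straddles two reads is still found, at the cost of the
one concatenation per chunk that is being objected to.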

One way of solving the string concatenation issue would be to write a
string find routine that will work on lists of strings while ignoring
the boundaries between list elements.  (I.e., it will consider the
list of strings to be one long string for its purposes.)  Unless it is
written in C, however, I bet it will typically be much slower than the
code I just provided.
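
A rough pure-Python sketch of such a find routine, just to make the
idea concrete.  The name find_across and its interface are
hypothetical, and nothing here has been timed against the split-based
code.  It only ever joins short pieces (at most len(needle) - 1
characters from each side of a boundary), never whole buffers.

```python
def find_across(chunks, needle):
    """Return the index of needle in "".join(chunks), or -1 if absent,
    without joining the chunks into one long string."""
    n = len(needle)
    if n == 0:
        return 0  # same convention as str.find
    offset = 0  # absolute index at which the current chunk starts
    tail = ""   # last n-1 characters of everything before this chunk
    for chunk in chunks:
        if tail:
            # A match that starts before this chunk must lie inside this
            # short "seam" built from the tail and the chunk's first chars.
            seam = tail + chunk[:n - 1]
            pos = seam.find(needle)
            if pos != -1:
                return offset - len(tail) + pos
        # Otherwise look for a match wholly inside the chunk.
        pos = chunk.find(needle)
        if pos != -1:
            return offset + pos
        offset += len(chunk)
        if n > 1:
            tail = (tail + chunk[-(n - 1):])[-(n - 1):]
    return -1


if __name__ == "__main__":
    print(find_across(["ab\r", "\ncd"], "\r\n"))
```

Keeping a running tail rather than only the previous chunk's ending
lets it handle a separator spread over more than two very short list
elements.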

> Perhaps you might like to refer back to CdV's solution which was
> prepending the residue to the first element of the split() result.

The problem with that solution is that it doesn't work in all cases
when the line-separation string is more than one character.
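
To illustrate the failure: the sketch below is my reading of the
"split first, then prepend the residue" idea, not CdV's actual code,
and the function name is made up.  When a two-character separator is
cut in half by a chunk boundary, neither chunk contains it, so
splitting the chunks individually never sees it.

```python
def split_then_prepend(chunks, sep):
    """Split each chunk on its own, then glue the previous residue onto
    the first piece.  Misses separators that straddle two chunks."""
    residue = ""
    for chunk in chunks:
        pieces = chunk.split(sep)
        pieces[0] = residue + pieces[0]
        residue = pieces.pop()
        for piece in pieces:
            yield piece
    if residue:
        yield residue


if __name__ == "__main__":
    chunks = ["ab\r", "\ncd"]  # the "\r\n" separator is cut in half
    print(list(split_then_prepend(chunks, "\r\n")))
    print("".join(chunks).split("\r\n"))  # what we actually wanted
```

The first print gives a single unsplit line, the second the two lines
"ab" and "cd".  With a one-character separator the two always agree,
which is why the problem only shows up for longer separators.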

>>        for line in lines: yield line + outputLineEnd

> In the case of leaveNewline being false, you are concatenating an empty
> string. IMHO, to quote Jon Bentley, one should "do nothing gracefully".

In Python (CPython, at least),

   longString + "" is longString

evaluates to True.  I don't know how you can do nothing more
gracefully than that.
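
A quick check anyone can run.  Note that the identity result is a
CPython optimization as far as I know, not something the language
promises; only the equality is guaranteed.  The variable names are
just for illustration.

```python
long_string = "x" * 100000
result = long_string + ""

# Guaranteed by the language: the value is unchanged.
print(result == long_string)

# Implementation detail: CPython hands back the very same object
# instead of copying, so no new string is built here.
print(result is long_string)
```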

|>oug


