On Mon, Oct 26, 2020 at 8:44 AM Cameron Simpson wrote:
On 24Oct2020 13:37, Dan Sommers <2QdxY4RzWzUUiLuE@potatochowder.com> wrote:
On 2020-10-24 at 12:29:01 -0400, Brian Allen Vanderburg II via Python-ideas
wrote: ... Find can output its filenames in null-terminated lines since it is possible to have newlines in a filename (yuck) ...
Spaces in filenames are just as bad, and much more common:
But much easier to handle in simple text listings, which are newline delimited.
You're really running into a horrible behaviour from xargs, which is one reason why GNU parallel exists.
I don't consider the behaviour horrible, and xargs isn't the only thing to do this - other tools can be put into zero-termination mode too. But it's pretty rare to consume huge amounts of data this way (normally it'll just be a list of file names), so what I would do is simply read the entire thing, then split on "\0". It's not like reading a gigabyte of log file, where you really want to work line by line and not read more than you need; a list of file names will easily fit into memory.

If you actually DO need to read null-terminated records from a file that's too big for memory, it's probably worth rolling your own buffering: read a chunk at a time and split off the complete records. It's not hugely difficult, and it's a good exercise to do now and then.

And yes, I can see the temptation to get Python to do it, but newline handling is already such a weird cross-platform mess that I don't think it needs to be made more complicated :)

ChrisA
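For what it's worth, a minimal sketch of both approaches described above - the function name and chunk size here are just illustrative, not anything from the stdlib:

```python
import io

def iter_null_terminated(f, chunk_size=8192):
    """Yield byte records from a binary file object, split on NUL bytes.

    Hand-rolled buffering: read a chunk at a time, split off the
    complete records, and carry any incomplete tail into the next read.
    """
    buf = b""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        parts = buf.split(b"\0")
        buf = parts.pop()  # last piece may be an incomplete record
        yield from parts
    if buf:  # trailing record with no final NUL
        yield buf

# The common case: the whole thing fits in memory, so just read and split.
# (find -print0 emits a trailing NUL, hence the empty last element.)
data = b"plain.txt\0with space.txt\0with\nnewline.txt\0"
names = data.split(b"\0")[:-1]
```

The generator only matters for genuinely huge inputs; for a typical `find -print0 | my_script` pipeline, the two-line read-and-split version is all you need.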