doctest.testfile fails on text files with Windows line endings
pmaupin at gmail.com
Sun Apr 11 00:01:41 EDT 2010
On Apr 10, 10:16 pm, Steven D'Aprano <st... at REMOVE-THIS-
> After converting a text file containing doctests to use Windows line
> endings, I'm getting spurious errors:
> ValueError: line 19 of the docstring for examples.txt has inconsistent
> leading whitespace: '\r'
> I don't believe that doctest.testfile is documented as requiring Unix
> line endings, and the line endings in the file are okay. I've checked in
> a hex editor, and they are valid \r\n line endings.
> In doctest._load_testfile, I find this comment and code:
>     # get_data() opens files as 'rb', so one must do the equivalent
>     # conversion as universal newlines would do.
>     return file_contents.replace(os.linesep, '\n'), filename
> which I read as an attempt to normalise line endings in the file to \n.
> (But surely this will fail? If you're running, say, Linux or MacOS,
> linesep will already be '\n' not '\r\n', and consequently the replace
> does nothing, any Windows line endings aren't normalised, and doctest
> will choke on the \r characters. It's only useful if running on Windows.)
> But the above only occurs when using a package loader. Otherwise,
> _load_testfile executes:
>     return open(filename).read(), filename
> which doesn't do any line ending normalisation at all.
> To my mind, this is a bug in doctest. Does anyone disagree? I think the
> simplest fix is to change it to:
>     return open(filename, 'rU').read(), filename
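
To make the os.linesep point concrete, here is a small sketch in Python 3 terms. Two things in it are assumptions rather than anything from doctest itself: the helper naive_normalize and its linesep parameter are made up for illustration (they just mimic the quoted replace call so both platforms can be shown on one machine), and the universal-newlines read uses newline=None, the Python 3 text-mode default, because the 'U' mode flag was removed in Python 3.11.

```python
import os
import tempfile


def naive_normalize(text, linesep):
    # Hypothetical stand-in for file_contents.replace(os.linesep, '\n'),
    # with the platform separator passed in explicitly.
    return text.replace(linesep, '\n')


windows_text = ">>> 1 + 1\r\n2\r\n"

# With the Windows separator ('\r\n') the replace does its job.
on_windows = naive_normalize(windows_text, '\r\n')

# With the Linux/macOS separator ('\n') it is a no-op: the '\r' survives.
on_posix = naive_normalize(windows_text, '\n')

# Text mode with universal newlines translates '\r\n' on any platform.
fd, path = tempfile.mkstemp(suffix='.txt')
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(windows_text.encode('ascii'))
    with open(path, encoding='ascii', newline=None) as f:
        universal = f.read()
finally:
    os.remove(path)
```

The on_posix case is the one that leaves stray '\r' characters for the parser to trip over.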
Seems like a bug to me. I often assume that I don't know where a
string is coming from, so one of the first steps I usually take when
parsing a string is:
    s = s.replace('\r\n', '\n').replace('\r', '\n')
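
Wrapped up as a function, that one-liner might look like the sketch below. The name normalize_newlines and the sample string are only for illustration. The order of the two replace calls matters: handling '\r\n' first keeps a Windows line ending from turning into two newlines.

```python
def normalize_newlines(s):
    # Collapse Windows endings first, then any remaining bare '\r'
    # (old Mac style), so that '\r\n' never becomes '\n\n'.
    return s.replace('\r\n', '\n').replace('\r', '\n')


mixed = "unix\nwindows\r\nold mac\rend"
normalized = normalize_newlines(mixed)
print(normalized.split('\n'))
```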
And, out of long-standing pre-Python habit, I always open files in
binary mode and then have my way with them. I know universal mode is
available, but honestly, I don't care for all the bookkeeping on what
kinds of line endings have been seen -- I just want to normalize the
line endings.
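
In Python 3 terms, the binary-mode habit could be sketched roughly as follows. The function name read_normalized, its encoding parameter, and the temporary demo file are all assumptions made for the example. Doing the replacement on bytes before decoding is only safe for ASCII-compatible encodings such as UTF-8; it would not be correct for UTF-16.

```python
import os
import tempfile


def read_normalized(path, encoding='utf-8'):
    # Read raw bytes so no platform-dependent newline translation happens.
    with open(path, 'rb') as f:
        data = f.read()
    # Normalize on bytes; assumes an ASCII-compatible encoding.
    data = data.replace(b'\r\n', b'\n').replace(b'\r', b'\n')
    return data.decode(encoding)


# Demo file with Windows, old Mac, and Unix line endings mixed together.
fd, demo_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(b"first\r\nsecond\rthird\n")
    text = read_normalized(demo_path)
finally:
    os.remove(demo_path)
```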