Re: [Numpy-discussion] ANN: Numpy 1.6.0 beta 2
On 4/5/11 10:33 PM, Matthew Brett wrote:
Did you mean to send this just to me? It seems like the whole thing is generally interesting and helpful, at least to me...
I did mean to send to the list -- I've done that now.
Well, the current code doesn't split on \r in py3k, admittedly that must be a rare case.
I guess that's a key question here -- it really *should* split on \r, but maybe it's rare enough to be unimportant.
The general point about whether to read binary or text is also in play. I agree with you, reading text sounds like a better idea to me, but I don't know the performance hit. Pauli had it as reading binary in the original conversion and was defending that in an earlier email...
The thing is -- we're talking about parsing text here, so we really are reading text files, NOT binary files. So the question really is: do we want py3's file reading code to handle encoding issues for us, or do we want to handle them ourselves?

If we only want to support ANSI encodings, then handling them ourselves may well be easier and faster. If we go that way, we need to handle line endings, too. The easy way is to only support line endings with a '\n' in them -- that works out of the box. But it's not that hard to support '\r' also, depending on how you want to do it. Before 'U' existed, I did that all the time, something like:

    some_text = file.read(some_size_buffer)
    some_text = some_text.replace('\r\n', '\n')
    some_text = some_text.replace('\r', '\n')
    lines = some_text.split('\n')

(By the way, if you're going to support this, it's really nice to support mixed line endings (like this approach does) -- there are a lot of editors that can make a mess of line endings.)

If you can read the entire file into memory at once, this is almost trivial; if you can't, there is a bit more bookkeeping code to be written.

DARN -- I think I said my last note was the last on this topic!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
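
For reference, the "bit more bookkeeping" needed when the file is read in chunks might look roughly like the sketch below. It is only an illustration, not the code under discussion in NumPy: the names `normalize_newlines` and `iter_lines` and the default `chunk_size` are made up here, and it assumes the input is already decoded text. The two things to carry between chunks are the incomplete last line and a trailing '\r' that may be the first half of a '\r\n' pair split across two reads.

```python
import io


def normalize_newlines(text):
    """Convert '\\r\\n' and bare '\\r' to '\\n'; mixed endings are fine."""
    # str.replace returns a new string, so the results must be chained or
    # reassigned. '\r\n' has to be handled before the bare '\r'.
    return text.replace('\r\n', '\n').replace('\r', '\n')


def iter_lines(fileobj, chunk_size=8192):
    """Yield lines (without terminators) from a text file read in chunks.

    Hypothetical helper: `fileobj` must return str from read() and must not
    translate newlines itself.
    """
    pending = ''  # incomplete last line (plus a held-back '\r', if any)
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        text = pending + chunk
        # A trailing '\r' might be the start of '\r\n'; hold it back until
        # the next chunk shows what follows it.
        held = ''
        if text.endswith('\r'):
            held = '\r'
            text = text[:-1]
        lines = normalize_newlines(text).split('\n')
        # The final piece is not known to be complete yet.
        pending = lines.pop() + held
        for line in lines:
            yield line
    if pending:
        # At end of file a held-back '\r' was simply a line terminator.
        yield pending[:-1] if pending.endswith('\r') else pending


if __name__ == '__main__':
    sample = io.StringIO('one\r\ntwo\rthree\nfour', newline='')
    print(list(iter_lines(sample, chunk_size=4)))
```

With a real file on Python 3, opening it as `open(path, newline='')` keeps the line endings untranslated so that a helper like this sees them; leaving `newline` at its default lets the text layer do the universal-newline translation instead.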