[Csv] Something's fishy w/ Mac line endings...

sjmachin at lexicon.net
Thu Aug 21 00:59:24 CEST 2003


On 19 Aug 2003 at 13:56, Andrew McNamara wrote:

> The problem is that our end of line processing is incompatible with the
> use of an iterator as the source of input lines - there is no satisfactory
> answer that allows us to retain both.

Using an iterator as a source of what? Lines, you say? The documentation says
it "iterates over lines" [what does that mean?] and that the iterator should
return "strings", without saying what they should contain, how they should be
terminated, etc. See examples with commentary below.

> 
> The requirement that the input file be opened in binary mode for what
> is obviously a text format is going to be a never-ending source of surprise
> for people using the module, and seems like a bigger wart than the one
> we're now facing.
> 

I agree on the surprise factor with binary mode. It's not obvious what the purpose is. 
How does Excel on the Mac terminate lines in CSV files? CR or CRLF?
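
For comparison only, here is a minimal sketch using the csv module as shipped
in current Python 3, not the 2003 code under discussion, so its behaviour may
well differ from what this thread describes. It assumes the data arrives as a
single string and uses io.StringIO with newline='' as a stand-in for a file
opened without newline translation; the helper name parse_rows is made up for
illustration. With that setup, CR-only and CRLF-terminated data appear to parse
to the same rows.

```python
import csv
import io


def parse_rows(text):
    # Hypothetical helper: newline='' makes the stream split on CR, LF and
    # CRLF while passing the terminators through to csv.reader untranslated.
    return list(csv.reader(io.StringIO(text, newline='')))


mac_rows = parse_rows('aaa,bbb,ccc\rddd,eee\r')      # CR-only terminators
dos_rows = parse_rows('aaa,bbb,ccc\r\nddd,eee\r\n')  # CRLF terminators
print(mac_rows)
print(dos_rows)
```

This says nothing about what Excel on the Mac actually writes; it only shows
that either terminator can be handed to the reader as-is.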

>>> import csv
>>> alist= ['aaa,bbb,ccc', 'ddd,eee', 'fff']
>>> [x for x in csv.reader(alist)]
[['aaa', 'bbb', 'ccc'], ['ddd', 'eee'], ['fff']]
# so we don't need line terminators

>>> blist= ['aaa,bbb,ccc\n', 'ddd,eee\n', 'fff\n']
>>> [x for x in csv.reader(blist)]
[['aaa', 'bbb', 'ccc'], ['ddd', 'eee'], ['fff']]
# but if they are supplied, they are ignored

>>> clist= ['aaa,bbb\nccc\n', 'ddd,eee\n', 'fff\n']
>>> [x for x in csv.reader(clist)]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
_csv.Error: newline inside string
# except when embedded in an unquoted string/line

>>> dlist= ['aaa,"bbb\nccc",qqq\n', 'ddd,eee\n', 'fff\n']
>>> [x for x in csv.reader(dlist)]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
_csv.Error: newline inside string
# whoops, we really do have to pretend we are reading a file in *TEXT* mode
# (see next example)

>>> elist= ['aaa,"bbb\n', 'ccc",qqq\n', 'ddd,eee\n', 'fff\n']
>>> [x for x in csv.reader(elist)]
[['aaa', 'bbb\nccc', 'qqq'], ['ddd', 'eee'], ['fff']]
# Wow, how do we explain all that to J. Random Newbie?
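
One possible way to put it to the newbie, sketched against the csv module in
current Python 3 rather than the 2003 version shown above (whose behaviour may
differ): hand the reader the lines with their terminators intact, and let it
decide which newlines end a record and which belong to a quoted field. The
sample data and the io.StringIO stand-in for a file opened with newline='' are
assumptions for illustration.

```python
import csv
import io

# Write a row whose middle field contains a newline; csv.writer quotes it.
buf = io.StringIO(newline='')
csv.writer(buf).writerows([['aaa', 'bbb\nccc', 'qqq'], ['ddd', 'eee'], ['fff']])
text = buf.getvalue()

# Iterating the stream splits on every newline, including the embedded one,
# much like the elist example; the reader joins the quoted field back up.
lines = list(io.StringIO(text, newline=''))
rows = list(csv.reader(lines))
print(lines)
print(rows)
```

The point of the sketch is that the caller never has to know where the
embedded newline is: four physical lines go in and three records come out.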
