[issue5445] codecs.StreamWriter.writelines problem when passed generator

Marc-Andre Lemburg report at bugs.python.org
Tue Mar 10 16:52:56 CET 2009


Marc-Andre Lemburg <mal at egenix.com> added the comment:

On 2009-03-10 16:36, Daniel Lescohier wrote:
> Daniel Lescohier <daniel.lescohier at cbs.com> added the comment:
> 
> Let me give an example of why it's important that writelines 
> iteratively writes.  For:
> 
> rows = (line[:-1].split('\t') for line in in_file)
> projected = (keep_fields(row, 0, 3, 7) for row in rows)
> filtered = (row for row in projected if row[2]=='1')
> out_file.writelines('\t'.join(row)+'\n' for row in filtered)
> 
> For a large input file, for a regular out_file object, this will work. 
> For a codecs.StreamWriter wrapped out_file object, this won't work, 
> because it's not following the file protocol that writelines should 
> iteratively write.

Of course, it's possible to have a generator producing lots of data,
but that does not warrant making most uses of that method slow.

If you'd like to see such support in .writelines(), please provide
an implementation that follows the approach of the file object
implementation in Python 2.x: if it finds that the input object is
not a sequence, it writes the lines in chunks of 1000 lines each
(actually, the check is stricter than that in some cases: it
requires a Python list).

The standard case of passing a list of strings to that method
should not get slower because of this.
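
A minimal sketch of the chunked approach described above, written for
Python 3 strings. It is not the codecs or file object implementation:
writelines_chunked and CountingStream are hypothetical names used only
for illustration, and the 1000-line chunk size is taken from the
description above. A list takes the fast path (one join, one write);
any other iterable is consumed in chunks.

```python
import codecs
import io
from itertools import islice


def writelines_chunked(writer, lines, chunk_size=1000):
    """Hypothetical helper, not the actual codecs implementation."""
    if isinstance(lines, list):
        # Standard case: a list is joined once and written in one call,
        # so it is no slower than a plain join-and-write.
        writer.write(''.join(lines))
        return
    iterator = iter(lines)
    while True:
        # Pull at most chunk_size lines so a large generator is never
        # fully materialized in memory.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        writer.write(''.join(chunk))


class CountingStream(io.BytesIO):
    """Illustrative byte stream that counts write() calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def write(self, data):
        self.calls += 1
        return super().write(data)


stream = CountingStream()
writer = codecs.getwriter('utf-8')(stream)
# 2500 generated lines are written as chunks of 1000, 1000 and 500.
writelines_chunked(writer, ('row %d\n' % i for i in range(2500)))
```

With this sketch the generator case costs one write per chunk, while a
list still goes through a single join and a single write.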

BTW: I am not aware of any .writelines() file protocol. If there
were such a protocol, .readlines() would also have to return
an iterator (which it doesn't). The idea behind .writelines()
is to be able to write back a list generated with .readlines().
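
To illustrate the round trip meant here, a small sketch using the
Python 3 codecs stream classes with in-memory buffers (the sample
data and variable names are made up for the example):

```python
import codecs
import io

# Source bytes standing in for a file on disk.
raw_in = io.BytesIO('first\nsecond\n'.encode('utf-8'))
reader = codecs.getreader('utf-8')(raw_in)

# .readlines() returns a list of decoded lines, line endings included.
lines = reader.readlines()

raw_out = io.BytesIO()
writer = codecs.getwriter('utf-8')(raw_out)

# .writelines() writes that list back without adding separators.
writer.writelines(lines)
```

The bytes in raw_out then match the bytes that were read from raw_in.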

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5445>
_______________________________________

