Add peekline(), peeklines(n) and optional maxlines argument to readlines()
It seems there could be a cleaner way of reading the first n lines of a file and additionally not seeking past those lines (i.e. peek). This is obviously a trivial task for 1 line, i.e.

    f.readline()
    f.seek(0)

but one that I think would make sense to add to the IO implementation. Given that we already have readline, readlines, and peek, I think peekline() or peeklines(n) is only a natural addition. The argument for doing so (in 3.3 of course) is primarily readability, but also that the maintenance burden *seems* like it would be low. This addition would also be helpful and more concise where n > 1.

I think readlines() should also take an optional argument for a max number of lines to read. It seems more common/helpful to me than 'hint' for max bytes. In the n > 1 case one could do

    f.readlines(maxlines=10)

or for the 'peek' case

    f.peeklines(10)

I also didn't find any of the answers from http://stackoverflow.com/questions/1767513/read-first-n-lines-of-a-file-in-p... to be very compelling.

I am more than willing to propose a patch if the idea(s) are supported.

- John
+0.5 about f.readlines(maxlines=10); it might be a good addition, since the sizehint parameter does not allow an actual prediction of how many lines will be returned. On the other hand, the two arguments are mutually exclusive, so I'm not sure what to expect in case both are specified (maybe ValueError, but that kind of API always leaves me a little skeptical).

+0 about f.peeklines(10), as it only saves one line of code:

    f.peeklines(10)
    f.seek(0)

...or 2 in case you're not at the beginning of the file:

    before = f.tell()
    f.peeklines(10)
    f.seek(before)

Not a great advantage vs. the fact of introducing (and remembering) a new function, in my opinion.

Regards,

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/

2011/9/30 John O'Connor <jxo6948@rit.edu>
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas
Whoops! In my peekline examples, f.peeklines obviously must be replaced with f.readlines. =)

Regards,

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/

2011/9/30 Giampaolo Rodolà <g.rodola@gmail.com>
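To make the two points above concrete, here is a minimal sketch rather than a proposed API. It first shows that the existing hint argument of readlines() is a size threshold and not a line count, then wraps the tell()/seek() idiom in a helper. The name peek_lines is hypothetical, and the helper assumes the stream is seekable.

```python
import io


def peek_lines(f, n):
    """Return up to n lines from f, then restore the stream position.

    Hypothetical helper for illustration; assumes f is seekable.
    """
    before = f.tell()
    try:
        lines = []
        for _ in range(n):
            line = f.readline()
            if not line:  # an empty result means end of file
                break
            lines.append(line)
        return lines
    finally:
        # Always seek back, even if reading raised an exception.
        f.seek(before)


f = io.StringIO("one\ntwo\nthree\nfour\n")

# hint is a size threshold, not a line count: reading stops once the
# total length of the lines read so far exceeds it.
hinted = f.readlines(5)
f.seek(0)

peeked = peek_lines(f, 3)
position_after_peek = f.tell()
```

With a hint of 5, two lines come back here because the first line alone (4 characters) does not exceed the threshold. That is why the hint cannot be used to ask for a fixed number of lines.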
On Fri, Sep 30, 2011 at 5:42 AM, Giampaolo Rodolà <g.rodola@gmail.com> wrote:
...or 2 in case you're not at the beginning of the file.

    before = f.tell()
    f.peeklines(10)
    f.seek(before)
A context manager to handle the tell()/seek() may be an interesting and more general purpose idea:

    # In the io module
    class _TellSeek:
        def __init__(self, f):
            self._f = f
        def __enter__(self):
            self._position = self._f.tell()
        def __exit__(self, *args):
            self._f.seek(self._position)

    def restore_position(f):
        return _TellSeek(f)

    # Usage
    with io.restore_position(f):
        for i, line in enumerate(f, 1):
            # Do stuff
            if i == 10:
                break
        else:
            # Oops, didn't get as many lines as we wanted
            pass

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, 30 Sep 2011 07:28:19 -0400 Nick Coghlan <ncoghlan@gmail.com> wrote:
A context manager to handle the tell()/seek() may be an interesting and more general purpose idea:
But it still only works on seekable streams.

Some time ago I proposed a general prefetch() method that would allow easy protocol-specific buffering on top of non-seekable streams, but there didn't seem to be a lot of enthusiasm at the time:
http://mail.python.org/pipermail/python-ideas/2010-September/008179.html

Regards

Antoine.
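For readers wondering what peeking could look like when seek() is unavailable, here is a rough sketch of one possible approach: buffer the peeked lines in a wrapper and hand them back on later reads. This is not the prefetch() API from the linked proposal. The class name LinePeeker and its peeklines method are made up for this illustration.

```python
import collections


class LinePeeker:
    """Wrap an iterable of lines and allow peeking without seeking.

    Hypothetical helper for illustration; not the prefetch() proposal.
    """

    def __init__(self, lines):
        self._it = iter(lines)
        self._buffer = collections.deque()  # lines peeked but not yet consumed

    def peeklines(self, n):
        """Return up to n upcoming lines without consuming them."""
        while len(self._buffer) < n:
            try:
                self._buffer.append(next(self._it))
            except StopIteration:
                break  # fewer than n lines remain
        return list(self._buffer)[:n]

    def __iter__(self):
        return self

    def __next__(self):
        # Serve buffered lines first so nothing that was peeked is lost.
        if self._buffer:
            return self._buffer.popleft()
        return next(self._it)


def non_seekable_source():
    # A generator stands in for a pipe or socket: it cannot be rewound.
    yield from ["alpha\n", "beta\n", "gamma\n"]


reader = LinePeeker(non_seekable_source())
peeked = reader.peeklines(2)
consumed = list(reader)
```

The trade-off is that all later reads must go through the wrapper, since the underlying stream has already advanced past the peeked lines.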
Some time ago I proposed a general prefetch() method that would allow easy protocol-specific buffering on top of non-seekable streams
There is also an issue for it here: http://bugs.python.org/issue12053

- John
I thought of a somewhat elegant recipe for the readlines(maxlines) case:

    f = open(...)
    itertools.islice(f, maxlines)

- John
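A runnable version of that recipe might look like the sketch below. It uses io.StringIO in place of a real file, and the wrapper name read_max_lines is hypothetical. Note that islice() returns a lazy iterator, so it is wrapped in list() here to get a list like readlines() returns.

```python
import io
import itertools


def read_max_lines(f, maxlines):
    """Return at most maxlines lines from f as a list (hypothetical helper)."""
    return list(itertools.islice(f, maxlines))


f = io.StringIO("a\nb\nc\nd\n")
first_two = read_max_lines(f, 2)

# The stream is left positioned just after the lines that were read.
rest = f.read()
```

Unlike the peek case, this consumes the lines, so it needs no seek() and works on any iterable of lines.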
On Fri, Sep 30, 2011 at 13:28, Nick Coghlan <ncoghlan@gmail.com> wrote:
A context manager to handle the tell()/seek() may be an interesting and more general purpose idea:
    # In the io module
    class _TellSeek:
        def __init__(self, f):
            self._f = f
        def __enter__(self):
            self._position = self._f.tell()
        def __exit__(self, *args):
            self._f.seek(self._position)

    def restore_position(f):
        return _TellSeek(f)

    # Usage
    with io.restore_position(f):
        for i, line in enumerate(f, 1):
            # Do stuff
            if i == 10:
                break
        else:
            # Oops, didn't get as many lines as we wanted
            pass
This is useful, and made simpler by contextlib.contextmanager. Actually, I just posted this snippet on G+ a couple of weeks ago, since I found it very useful for some stream-massaging code I was writing. And yes, it only makes sense for seekable streams, of course.

So I'm -1 on the peeklines request, since it's easily implemented by other means.

Eli
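For reference, a contextlib.contextmanager version of the same idea could look roughly like this. It is a sketch and not the snippet Eli posted. It keeps the restore_position name from Nick's example, but here it is a plain module-level function, not part of io. One small difference from the class version: it yields the file object, so the with statement can bind it.

```python
import contextlib
import io


@contextlib.contextmanager
def restore_position(f):
    """Remember f's position on entry and seek back to it on exit."""
    position = f.tell()
    try:
        yield f
    finally:
        # Runs on normal exit and on exceptions alike.
        f.seek(position)


f = io.StringIO("one\ntwo\nthree\n")
f.readline()  # move past the first line

with restore_position(f):
    peeked = [f.readline() for _ in range(2)]

# Reading resumes from where the with block started.
next_line = f.readline()
```

As noted in the thread, this still relies on tell() and seek(), so it only helps with seekable streams.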
On Tue, Oct 4, 2011 at 1:59 PM, Eli Bendersky <eliben@gmail.com> wrote:
This is useful, and made simpler by contextlib.contextmanager.
Yeah, the only reason I wrote it out by hand is that if it *did* go into the io module, we wouldn't want to depend on contextlib for it.

However, as Antoine pointed out, it only works for seekable streams and probably isn't general purpose enough to actually be included in the io module.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (5)
- Antoine Pitrou
- Eli Bendersky
- Giampaolo Rodolà
- John O'Connor
- Nick Coghlan