file reading by record separator (not line by line)

Steve Howell showell30 at yahoo.com
Thu May 31 20:46:47 EDT 2007


--- Marc 'BlackJack' Rintsch <bj_666 at gmx.net> wrote:
> There was just recently a thread with a
> `itertools.groupby()` solution. 

Yes, indeed. I think it's a very common coding problem
(with plenty of mostly analogous variations) that has
these very common pitfalls:

  1) People often forget to handle the last block.
This is not exactly an off-by-one error (OBOE) in the
classic sense, but it's an OBOE-like bug waiting to
happen.

  2) Even folks who solve this correctly won't always
solve it idiomatically.

  3) The problem often comes up with the added
complication of a non-finite data stream (snooping on
syslog, etc.).
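
To make pitfalls 1 and 3 concrete, here is a minimal sketch of
the hand-rolled version, written as a generator. The names
iter_blocks and is_block_start are hypothetical, chosen only
for illustration, and the sketch assumes a block begins at any
line for which is_block_start(line) is true.

```python
def iter_blocks(lines, is_block_start):
    """Yield lists of lines, starting a new list at each block-start line."""
    block = []
    for line in lines:
        # A start line closes the block collected so far (if any).
        if is_block_start(line) and block:
            yield block
            block = []
        block.append(line)
    # Pitfall 1: without this final flush the last block is silently lost.
    if block:
        yield block


sample = ["#a", "1", "2", "#b", "3"]
blocks = list(iter_blocks(sample, lambda line: line.startswith("#")))
```

Because it is a generator, each block is handed over as soon
as the next start line arrives, so it can also consume a
stream that never ends. The block still in progress is only
emitted once another start line shows up or the input is
exhausted.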

I think itertools.groupby() is usually the key
batteries-included component in elegant solutions to
this problem, but I wonder if the Python community
couldn't help a lot of newbies (or insufficiently
caffeinated non-newbies) by any of the following:

  1) Add a function to some Python module (maybe not
itertools?) that implements something to the effect of
group_blocks(identify_block_start_method).

  2) Address this in the cookbook.

  3) Promote this problem as a classic use case of
itertools.groupby() (despite the function's
advancedness), and provide helpful examples in the
itertools docs.
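
As a rough sketch of what suggestion 1 might look like when
built on itertools.groupby(): the name group_blocks comes from
the suggestion itself, but the signature, the is_block_start
parameter, and the counter-based key are assumptions made for
illustration, not an existing library API.

```python
from itertools import groupby


def group_blocks(lines, is_block_start):
    """Yield lists of lines, starting a new list at each block-start line."""
    block_no = 0

    def key(line):
        # groupby() calls the key once per item, in order, so bumping a
        # counter on every start line gives each block its own key value.
        nonlocal block_no
        if is_block_start(line):
            block_no += 1
        return block_no

    for _, group in groupby(lines, key):
        yield list(group)


sample = ["#a", "1", "2", "#b", "3"]
grouped = list(group_blocks(sample, lambda line: line.startswith("#")))
```

groupby() emits the final group on its own, so there is no
separate "flush the last block" step to forget, and it reads
its input lazily, which suits an open-ended stream.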

Thoughts?




       


