pushback iterator
Matus
matusu at gmail.com
Sun May 17 16:17:17 EDT 2009
Luis Alberto Zarrabeitia Gomez wrote:
> Quoting Mike Kazantsev <mk.fraggod at gmail.com>:
>
>> And if you're "pushing back" the data for later use you might just as
>> well push it to dict with the right indexing, so the next "pop" won't
>> have to roam thru all the values again but instantly get the right one
>> from the cache, or just get on with that iterable until it depletes.
>>
>> What real-world scenario am I missing here?
>
> Other than the one he described in his message? Neither of your proposed solutions
> solves the OP's problem. He doesn't have a list (he /could/ build a list, and
> thus defeat the purpose of having an iterator). He /could/ use alternative data
> structures, like the dictionary you are suggesting... and he is, he is using his
> pushback iterator, but he has to include it over and over.
>
> Currently there is no good "pythonic" way of building functions that decide to
> stop consuming from an iterator when the first invalid input is encountered:
> that last, invalid input is lost from the iterator. You can't just abstract the
> whole logic inside the function, "something" must leak.
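
The lost element can be seen with a small sketch. The helper take_small and the sample numbers are made up for illustration; only itertools.takewhile is a real library call.

```python
import itertools


def take_small(numbers):
    """Hypothetical consumer: collect values while they are below 10."""
    return list(itertools.takewhile(lambda x: x < 10, numbers))


numbers = iter([1, 2, 3, 10, 11, 12])
small = take_small(numbers)
remaining = list(numbers)

print(small)      # [1, 2, 3]
print(remaining)  # [11, 12]: takewhile consumed the 10 in order to test it
```

The caller never sees the 10: the function had to pull it from the iterator to learn that it should stop, and there is no way to hand it back.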
>
> Consider, for instance, the itertools.dropwhile (and takewhile). You can't just
> use it like
>
> i = iter(something)
> itertools.dropwhile(condition, i)
> # now consume the rest
>
> Instead, you have to do this:
>
> i = iter(something)
> i = itertools.dropwhile(condition, i)
> # and now i contains _another_ iterator
> # and the first one still exists[*], but shouldn't be used
> # [*] (assume it was a parameter instead of the iter construct)
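
A small sketch of both forms, with made-up sample data. It relies only on the documented lazy behaviour of itertools.dropwhile.

```python
import itertools

source = iter([1, 2, 3, 10, 11, 12])

# Discarding the return value does nothing useful: dropwhile is lazy and
# has not pulled anything from source yet.
itertools.dropwhile(lambda x: x < 10, source)
untouched = next(source)   # still the very first element

# Rebinding works, but the new iterator wraps the old one and shares its state.
source = iter([1, 2, 3, 10, 11, 12])
rest = itertools.dropwhile(lambda x: x < 10, source)
first_kept = next(rest)    # first element that fails the condition
after = list(source)       # the old iterator has already moved past it

print(untouched, first_kept, after)
```

Reading from the old iterator after this point skips whatever the wrapper already pulled, which is why it "shouldn't be used".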
>
> For parsing files, for instance (similar to the OP's example), it could be nice
> to do:
>
> f = file(something)
> lines = iter(f)
> parse_headers(lines)
> parse_body(lines)
> parse_footer(lines)
>
That is basically one of the many possible scenarios I was referring to.
Another example would be:
----------------------------------------------------
lines = Pushback_wrapper( open( 'my.file' ).readlines( ) )
for line in lines:
    if is_outer_scope( line ):
        # Do some processing for this logical scope of the file.
        # There are only a few outer scope lines.
        continue
    for line in lines:
        # Here we expect 1000 - 2000 lines of inner scope, and we do not
        # want to run is_outer_scope() for every line as it is expensive,
        # so we decided to reiterate.
        if is_inner_scope( line ):
            # Do some processing for this logical scope of the file
            # until the outer scope condition occurs.
            pass
        elif is_outer_scope( line ):
            lines.pushback( line )
            break
        else:
            pass  # flush line
----------------------------------------------------
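
For reference, a minimal sketch of what such a wrapper could look like. Only the names Pushback_wrapper and pushback come from the example above; the implementation is an assumption, and it uses the Python 3 spelling __next__ (Python 2 would call the method next).

```python
class Pushback_wrapper:
    """Iterator wrapper with a pushback() method (illustrative sketch only)."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pushed = []   # stack: the most recently pushed item comes out first

    def __iter__(self):
        return self

    def __next__(self):
        # Serve pushed-back items before touching the underlying iterator.
        if self._pushed:
            return self._pushed.pop()
        return next(self._it)

    def pushback(self, item):
        self._pushed.append(item)


lines = Pushback_wrapper(["a", "b", "STOP", "c"])
seen = []
for line in lines:
    if line == "STOP":
        lines.pushback(line)   # leave the terminator for the next consumer
        break
    seen.append(line)

rest = list(lines)
print(seen, rest)
```

Because the wrapper is its own iterator, a nested "for" over the same object continues where the outer loop left off, which is what the example depends on.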
> which is currently impossible.
>
> To the OP: if you don't mind doing instead:
>
> f = file(something)
> rest = parse_headers(f)
> rest = parse_body(rest)
> rest = parse_footer(rest)
>
> you could return itertools.chain([pushed_back], iterator) from your parsing
> functions. Unfortunately, this adds another layer of itertools.chain on top
> of the iterator; you will have to hope this does not cause a
> performance/memory penalty.
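
A rough sketch of that approach. The "H:" line format and the body of parse_headers are invented for illustration, and this variant returns the parsed headers together with the rest rather than the rest alone.

```python
import itertools


def parse_headers(lines):
    """Hypothetical parser: consume 'H:' lines, return (headers, rest)."""
    lines = iter(lines)   # make sure the loop and chain share one iterator
    headers = []
    for line in lines:
        if not line.startswith("H:"):
            # Put the line that ended the section back in front of the rest.
            return headers, itertools.chain([line], lines)
        headers.append(line)
    return headers, iter(())   # input exhausted, nothing left to hand on


lines = iter(["H:a", "H:b", "body 1", "body 2"])
headers, rest = parse_headers(lines)
body = list(rest)
print(headers, body)
```

Each parsing stage that returns early wraps the iterator once more, so a long pipeline ends up with one chain layer per stage.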
>
> Cheers,
>