pushback iterator

Matus matusu at gmail.com
Sun May 17 22:17:17 CEST 2009



Luis Alberto Zarrabeitia Gomez wrote:
> Quoting Mike Kazantsev <mk.fraggod at gmail.com>:
> 
>> And if you're "pushing back" the data for later use you might just as
>> well push it to dict with the right indexing, so the next "pop" won't
>> have to roam thru all the values again but instantly get the right one
>> from the cache, or just get on with that iterable until it depletes.
>>
>> What real-world scenario am I missing here?
> 
> Other than the one he described in his message? Neither of your proposed solutions
> solves the OP's problem. He doesn't have a list (he /could/ build a list, and
> thus defeat the purpose of having an iterator). He /could/ use alternative data
> structures, like the dictionary you are suggesting... and he is, he is using his
> pushback iterator, but he has to include it over and over.
> 
> Currently there is no good "pythonic" way of building functions that decide to
> stop consuming from an iterator when the first invalid input is encountered:
> that last, invalid input is lost from the iterator. You can't just abstract the
> whole logic inside the function, "something" must leak.
> 
> Consider, for instance, the itertools.dropwhile (and takewhile). You can't just
> use it like
> 
> i = iter(something)
> itertools.dropwhile(condition, i)
> # now consume the rest
> 
> Instead, you have to do this:
> 
> i = iter(something)
> i = itertools.dropwhile(condition, i) 
> # and now i contains _another_ iterator
> # and the first one still exists[*], but shouldn't be used
> # [*] (assume it was a parameter instead of the iter construct)
> 
> For parsing files, for instance (similar to the OP's example), it could be nice
> to do:
> 
> f = file(something)
> lines = iter(f)
> parse_headers(lines)
> parse_body(lines)
> parse_footer(lines)
> 

That is basically one of the many possible scenarios I was referring to.
Another example would be:

----------------------------------------------------
lines = Pushback_wrapper( open( 'my.file' ).readlines( ) )
for line in lines:
	if is_outer_scope( line ):
		# Do some processing for this logical scope of the file.
		# There are only a few outer-scope lines.
		continue

	for line in lines:
		# Here we expect 1000 - 2000 lines of inner scope, and we do not
		# want to run is_outer_scope() for every line because it is
		# expensive, so we iterate again in a nested loop.
		if is_inner_scope( line ):
			# Do some processing for this logical scope of the file
			# until the outer-scope condition occurs.
			pass
		elif is_outer_scope( line ):
			lines.pushback( line )
			break
		else:
			# Flush the line.
			pass
----------------------------------------------------
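
For anyone who wants to try the loop above, here is a minimal sketch of what
such a wrapper might look like. It is only an assumption about the interface
(an iterator plus a pushback() method); the class body, the _pushed stack and
the sample data are made up for illustration.

```python
class Pushback_wrapper(object):
    """Hypothetical sketch: wraps any iterable and adds pushback()."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._pushed = []  # stack of pushed-back items (last in, first out)

    def __iter__(self):
        return self

    def __next__(self):
        # Serve pushed-back items before touching the underlying iterator.
        if self._pushed:
            return self._pushed.pop()
        return next(self._it)

    next = __next__  # Python 2 spelling of the same method

    def pushback(self, item):
        self._pushed.append(item)


lines = Pushback_wrapper(["a", "b", "c"])
first = next(lines)       # consume one item
lines.pushback(first)     # ...and put it back
result = list(lines)      # the pushed-back item comes out first again
```

Because the pushed-back items are checked on every call, a nested "for" loop
over the same wrapper can push a line back, break, and the outer loop will
see that line next.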

> which is currently impossible.
> 
> To the OP: if you don't mind doing instead:
> 
> f = file(something)
> rest = parse_headers(f)
> rest = parse_body(rest)
> rest = parse_footer(rest)
> 
> you could return itertools.chain([pushed_back], iterator) from your parsing
> functions. Unfortunately, this adds another layer of itertools.chain on top of
> the iterator, so you will have to hope it does not cause a
> performance/memory penalty.
> 
> Cheers,
> 
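
A rough sketch of the itertools.chain variant suggested above, next to the
itertools.takewhile behaviour that motivates it. The parse_headers body, the
"#" header convention and the sample data are assumptions made up for the
example, not anything from the original parser.

```python
import itertools

def is_header(line):
    # Hypothetical rule: header lines start with "#".
    return line.startswith("#")

def parse_headers(lines, headers):
    """Collect header lines into `headers`, return an iterator over the rest."""
    lines = iter(lines)
    for line in lines:
        if not is_header(line):
            # First non-header line: put it back in front of the rest.
            return itertools.chain([line], lines)
        headers.append(line)
    return lines  # input exhausted, nothing to push back

data = ["# a", "# b", "body 1", "body 2"]

# With takewhile, the first rejected line is consumed and lost.
it = iter(data)
taken = list(itertools.takewhile(is_header, it))
leftover = list(it)       # "body 1" is missing here

# With the chain approach, the rejected line is still available.
headers = []
rest = parse_headers(data, headers)
body = list(rest)
```

The price is that the caller has to keep using the returned iterator instead
of the one it passed in.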


