Refactor a buffered class...
mahs at telcopartners.com
Wed Sep 6 23:35:03 CEST 2006
lh84777 at yahoo.fr wrote:
> actually for the example i have used only one sentry condition, but they
> are more numerous and complex; also i need to work on a huge amount of
> data (each word is a line with many features read from a file)
An open (text) file is a line-based iterator that can be fed directly to
'chunker'. As for different sentry conditions, I imagine they can be coded in
either model. How much is a 'huge amount' of data?
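One practical detail when feeding a file straight into a generator: each line the file yields still carries its trailing newline, so a comparison like item == "." would never match until the newline is stripped. A minimal sketch in Python 3, assuming one token per line; the helper name stripped_lines is hypothetical, and io.StringIO only stands in for a real open file:

```python
import io


def stripped_lines(lines):
    """Yield each line of a line-based iterator without its trailing newline."""
    for line in lines:
        yield line.rstrip("\n")


# io.StringIO stands in for an open text file here (an assumption made
# for the example); a real file object iterates line by line the same way.
fake_file = io.StringIO("this\n.\nis\na\n.\n")
tokens = list(stripped_lines(fake_file))
print(tokens)
```

Because both the file and the helper are lazy iterators, only one line is held in memory at a time, which matters if the input really is huge.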
>> to have:
>> this .
>> this . is a .
>> this . is a . test to .
>> is a . test to . check if it .
>> test to . check if it . works .
>> check if it . works . well .
>> works . well . it looks like .
> well . it looks like .
> it looks like .
Here's a small update to the generator that allows optional handling of the head
and the tail:
def chunker(s, chunk_size=3, sentry=".", keep_first = False, keep_last = False):
    sentry_count = 0
    for item in s:
        if item == sentry:
            sentry_count += 1
            if sentry_count < chunk_size:
>>> for p in chunker(s.split(), keep_first = True, keep_last=True): print "
this . is a .
this . is a . test to .
is a . test to . check if it .
test to . check if it . works .
check if it . works . well .
works . well . it looks like .
well . it looks like .
it looks like .
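The function as archived above stops at the "if sentry_count < chunk_size:" test, so the following is only a sketch of how a generator with that signature might be completed, written for Python 3. The buffer list, the sliding step and the tail loop are assumptions chosen to reproduce the quoted target output, not the original code, and the input string is rebuilt from that output.

```python
def chunker(s, chunk_size=3, sentry=".", keep_first=False, keep_last=False):
    """Yield sliding windows of up to chunk_size sentry-terminated groups."""
    buffer = []        # items of the groups currently inside the window
    sentry_count = 0   # number of complete groups held in buffer
    for item in s:
        buffer.append(item)
        if item == sentry:
            sentry_count += 1
            if sentry_count < chunk_size:
                # Window not full yet: emit it only if the head is wanted.
                if keep_first:
                    yield list(buffer)
            else:
                yield list(buffer)
                # Slide forward: drop the oldest group, sentry included.
                del buffer[:buffer.index(sentry) + 1]
                sentry_count -= 1
    if keep_last:
        # Emit the shrinking tail. Trailing items that are not followed by
        # a sentry are never emitted on their own.
        while sentry in buffer:
            yield list(buffer)
            del buffer[:buffer.index(sentry) + 1]


# Assumed input, reconstructed from the expected output in the thread.
s = "this . is a . test to . check if it . works . well . it looks like ."
for p in chunker(s.split(), keep_first=True, keep_last=True):
    print(" ".join(p))
```

Each yield hands out a copy of the buffer, so callers can keep the chunks around without seeing them change as the window slides. With both flags left at False, only the full three-group windows are produced.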