Refactoring a generator function
max
maxume at yahoo.com
Sat Dec 4 12:48:41 EST 2004
Kent Johnson <kent3737 at yahoo.com> wrote in
news:41b1d509$1_3 at newspeer2.tds.net:
> Here is a simple function that scans through an input file and
> groups the lines of the file into sections. Sections start with
> 'Name:' and end with a blank line. The function yields sections
> as they are found.
>
> def makeSections(f):
>     currSection = []
>
>     for line in f:
>         line = line.strip()
>         if line == 'Name:':
>             # Start of a new section
>             if currSection:
>                 yield currSection
>                 currSection = []
>             currSection.append(line)
>
>         elif not line:
>             # Blank line ends a section
>             if currSection:
>                 yield currSection
>                 currSection = []
>
>         else:
>             # Accumulate into a section
>             currSection.append(line)
>
>     # Yield the last section
>     if currSection:
>         yield currSection
>
> There is some obvious code duplication in the function - this bit
> is repeated 2.67 times ;-):
>     if currSection:
>         yield currSection
>         currSection = []
>
> As a firm believer in Once and Only Once, I would like to factor
> this out into a separate function, either a nested function of
> makeSections(), or as a separate method of a class
> implementation. Something like this:
>
>
> The problem is that yieldSection() now is the generator, and
> makeSections() is not, and the result of calling yieldSection()
> is a new iterator, not the section...
>
> Is there a way to do this or do I have to live with the
> duplication?
>
> Thanks,
> Kent
>
>
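This gets rid of some duplication by ignoring blank lines altogether,
which might be a bug (a line that follows a blank line but comes before
the next 'Name:' stays in the previous section)...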
def makeSections2(f):
    currSection = []
    for line in f:
        line = line.strip()
        if line:
            if line == 'Name:':
                # Start of a new section
                if currSection:
                    yield currSection
                    currSection = []
            currSection.append(line)
    # Yield the last section
    if currSection:
        yield currSection
but
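def makeSections2(f):
    currSection = []
    for line in f:
        line = line.strip()
        if line:
            if line == 'Name:':
                # Start of a new section
                if currSection:
                    yield currSection
                    currSection = []
            currSection.append(line)
        elif currSection:
            # Blank line ends a section; reset so it is not yielded twice
            yield currSection
            currSection = []
    # Yield the last section
    if currSection:
        yield currSection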
should be equivalent.
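
A quick way to check that claim is to run the variants side by side on a
small input. The sketch below is only an illustration: the function names
and the SAMPLE lines are made up for this test, and the third variant
assumes the section list is cleared after the blank-line yield (without
that reset the same section would be yielded again later).

```python
def sections_original(lines):
    """Quoted logic: 'Name:' starts a section, a blank line ends one."""
    section = []
    for line in lines:
        line = line.strip()
        if line == 'Name:':
            if section:
                yield section
                section = []
            section.append(line)
        elif not line:
            if section:
                yield section
                section = []
        else:
            section.append(line)
    if section:
        yield section


def sections_skip_blanks(lines):
    """First variant: blank lines are ignored entirely."""
    section = []
    for line in lines:
        line = line.strip()
        if line:
            if line == 'Name:' and section:
                yield section
                section = []
            section.append(line)
    if section:
        yield section


def sections_blank_ends(lines):
    """Second variant: a blank line ends the current section."""
    section = []
    for line in lines:
        line = line.strip()
        if line:
            if line == 'Name:' and section:
                yield section
                section = []
            section.append(line)
        elif section:
            yield section
            section = []  # assumed reset; avoids yielding the section twice
    if section:
        yield section


# Hypothetical input: 'stray' follows a blank line with no 'Name:' before it.
SAMPLE = ["Name:\n", "alpha\n", "\n", "stray\n", "Name:\n", "beta\n"]

original = list(sections_original(SAMPLE))
skip_blanks = list(sections_skip_blanks(SAMPLE))
blank_ends = list(sections_blank_ends(SAMPLE))

print(original == blank_ends)   # the second variant matches the original
print(original == skip_blanks)  # the first one differs on this input
```

On this input the first variant folds 'stray' into the preceding section,
while the other two yield it as a section of its own, so whether the
first variant is a bug depends on whether such lines can occur in the file.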