Newbie design problem

John Machin sjmachin at lexicon.net
Thu Dec 13 18:24:04 EST 2007


On Dec 14, 6:32 am, MartinRineh... at gmail.com wrote:
> Thanks to a lot of help, I've got the outer framework for my tokenizer
> down to this:
>
>     for line_number, line in enumerate(text):
>         output = ''
>
>         for char_number, char in enumerate(line):
>             output += char
>
>         print 'At ' + str(line_number) + ', '+ str(char_number) + ': ' + output,
>

The inner loop appears to be utterly redundant; AFAIK it can be
replaced by:

    output = line[:]
    char = line[-1] if line else ''
    char_number = len(line) - 1   # enumerate() counts from 0

with the observation that if "line" is empty, your code will crash if
it's the first line, and give misleading values (those belonging to
the most recent non-empty line) for "char" and "char_number"
otherwise.
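
To make the empty-line behaviour concrete, here is a minimal sketch that wraps the quoted loops in a function. The name last_position is purely illustrative (it is not in the original code), and the sketch uses Python 3 syntax rather than the print statement of the quoted code.

```python
def last_position(lines):
    """Run the quoted nested loops and record what would be printed.

    Hypothetical helper for illustration; returns a list of
    (line_number, char_number, output) tuples, one per line.
    """
    results = []
    for line_number, line in enumerate(lines):
        output = ''
        for char_number, char in enumerate(line):
            output += char
        # If "line" is empty, the inner loop body never ran, so
        # char_number is either unbound or left over from an earlier line.
        results.append((line_number, char_number, output))
    return results


# An empty second line silently reuses char_number from the first line.
print(last_position(['ab', '']))

# An empty first line raises NameError (UnboundLocalError inside a function).
try:
    last_position([''])
except NameError as exc:
    print('crashed:', exc)
```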

You mentioned design: I wouldn't call that the outer framework for a
tokeniser; I'd call it an example of one way of collecting the source
to be stuffed into a not-yet-visible tokeniser. The tokeniser should
be in a separate module with an API that either (a) allows you to push
chunks of text into the tokeniser or (b) requires you to supply an
iterable object so that the tokeniser can pull text. The tokeniser
itself should not live inside nested loops in your application code.
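
As a rough sketch of option (b), the pull style, the tokeniser could be a generator function that accepts any iterable of lines. Everything named here is an assumption made for illustration: the Token record, the tokenize function, and above all the token rule (runs of non-whitespace), since the original post does not say what the tokens are. The sketch uses Python 3 syntax.

```python
import re
from collections import namedtuple

# Hypothetical token record; the field names are illustrative only.
Token = namedtuple('Token', 'text line_number column')

# Assumed token rule: a token is any run of non-whitespace characters.
_TOKEN_RE = re.compile(r'\S+')


def tokenize(lines):
    """Pull lines from any iterable and yield one Token per match."""
    for line_number, line in enumerate(lines):
        for match in _TOKEN_RE.finditer(line):
            yield Token(match.group(), line_number, match.start())


# Application code only supplies the text; it has no tokenising loops.
source = ['x = 1', '', '  print x']
for token in tokenize(source):
    print('At %d, %d: %s' % (token.line_number, token.column, token.text))
```

Because an open file is itself an iterable of lines, the same function could be handed a file object instead of a list. An empty line simply yields no tokens, so the stale-value problem above does not arise.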


