My first Python program -- a lexer
thomas at mlynarczyk-webdesign.de
Sun Nov 9 15:39:30 CET 2008
John Machin wrote:
> Be consistent with your punctuation style. I'd suggest *not* having a
> space after ( and before ), as in the previous line. Read
What were the reasons for preferring (foo) over ( foo )? This PEP gives
recommendations for coding style, but (naturally) it does not mention
the reasons why the recommended way is preferable. I suppose these
matters have all been discussed -- is there a synopsis available?
>> self.source = re.sub( r"\r?\n|\r\n", "\n", source )
> Firstly, would you not expect to be getting your text from a text file
> (perhaps even one opened with the universal newlines option) i.e. by
> the time it's arrived here, source has already had \r\n changed to \n?
I was not aware of the universal newlines option. This would then indeed
make my newline conversion superfluous.
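
A minimal sketch of what reading with universal newlines could look like, assuming Python 3, where open() in text mode translates "\r\n" and "\r" to "\n" by default (newline=None). The helper name read_source and the temporary demo file are only illustrative and not part of the lexer under discussion.

```python
import os
import tempfile


def read_source(path):
    """Read a text file, letting Python translate all newline styles to "\n"."""
    # newline=None is the default in Python 3; spelled out here for clarity.
    with open(path, "r", encoding="utf-8", newline=None) as handle:
        return handle.read()


# Demo: write a file with mixed line endings in binary mode, then read it back.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as raw:
    raw.write(b"one\r\ntwo\rthree\n")
try:
    text = read_source(demo_path)
finally:
    os.remove(demo_path)

print(repr(text))  # 'one\ntwo\nthree\n'
```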
> Secondly, that's equivalent to
> re.sub(r"\n|\r\n|\r\n", "\n", source)
My mistake. I meant r"\r?\n|\r" ("\n", "\r\n" or "\r").
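
For illustration, the corrected pattern can be tried on a string that mixes all three newline styles. This is only a sketch; the names NEWLINE and normalize_newlines are made up for the example.

```python
import re

# "\r\n" and "\n" are covered by the first alternative, a lone "\r" by the second.
NEWLINE = re.compile(r"\r?\n|\r")


def normalize_newlines(source):
    """Replace every "\r\n", "\n" or "\r" with a single "\n"."""
    return NEWLINE.sub("\n", source)


print(repr(normalize_newlines("a\r\nb\rc\nd")))  # 'a\nb\nc\nd'
```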
> Thirdly, if source does contain \r\n and there is an error, the
> reported value of offset will be incorrect. Consider retaining the
> offset of the last newline seen, so that your error reporting can
> include the line number and (include or use) the column position in
> the line.
Indeed, I had not thought of that detail -- if I mess with the newlines,
the offset will be wrong with respect to the original source. But with
the universal newlines option mentioned above, the problem is already solved.
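
A rough sketch of how a line number and column could be derived from an offset, assuming the source uses "\n" only. It is a variant of the suggestion: instead of remembering the last newline while scanning, it looks the newline up on demand, which is simpler but rescans the text on every call. The function name line_and_column is hypothetical.

```python
def line_and_column(source, offset):
    """Return the 1-based (line, column) of the character at offset."""
    # Lines are counted by the newlines that occur before the offset.
    line = source.count("\n", 0, offset) + 1
    # rfind returns -1 when there is no earlier newline, which makes the
    # column of the first line come out 1-based as well.
    last_newline = source.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


source = "ab\ncd\nef"
print(line_and_column(source, 4))  # (2, 2), the "d"
```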
>> while self.offset < len( self.source ):
> You may like to avoid getting len(self.source) for each token.
Yes, I should change that. Unless there is a more elegant way to detect
the end of the source?
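
One simple possibility, sketched here with a toy scanner rather than the real lexer, is to compute the length once before the loop and compare against that local variable. The pattern WORD and the function split_tokens are assumptions made up for the example.

```python
import re

WORD = re.compile(r"\w+")


def split_tokens(source):
    """Collect runs of word characters, skipping everything else."""
    end = len(source)  # computed once instead of once per token
    offset = 0
    tokens = []
    while offset < end:
        match = WORD.match(source, offset)
        if match:
            tokens.append(match.group())
            offset = match.end()
        else:
            offset += 1  # skip a single non-word character
    return tokens


print(split_tokens("foo = bar1"))  # ['foo', 'bar1']
```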
>> for name, regex in self.tokens.iteritems():
> dict.iter<anything>() will return its results in essentially random order.
Ouch! I must do something about that. Thanks for pointing it out. So if
I want a certain order, I must use a list of tuples? Or is there a way
to have order with dicts?
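
A list of (name, regex) tuples does keep its order, so it is one way to control which pattern is tried first. The sketch below assumes Python 3; the token names and patterns are invented for the example and are not the ones from the lexer. (In Python 3.7 and later, plain dicts also preserve insertion order, which was not the case when this thread was written.)

```python
import re

# Patterns are tried from top to bottom, so the keyword must precede NAME.
TOKEN_SPECS = [
    ("IF", re.compile(r"if\b")),
    ("NUMBER", re.compile(r"\d+")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
]


def match_token(source, offset=0):
    """Return (name, text) for the first pattern that matches at offset."""
    for name, regex in TOKEN_SPECS:
        match = regex.match(source, offset)
        if match:
            return name, match.group()
    return None


print(match_token("if x"))  # ('IF', 'if')
print(match_token("iffy"))  # ('NAME', 'iffy')
```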
>> return "\n".join(
>> [ "[L:%s]\t[O:%s]\t[%s]\t'%s'" %
> For avoidance of ambiguity, you may like to change that '%s' to %r
In which way would there be ambiguity? The first two are integers, the
last two strings.
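
One possible reading of the suggestion, sketched under the assumption that a token's text may itself contain quotes, tabs or newlines: with '%s' such characters are printed literally and can break the tab-separated layout, while %r shows them escaped. The function name format_both is hypothetical.

```python
def format_both(text):
    """Return the token text formatted with '%s' and with %r."""
    return "'%s'" % text, "%r" % text


plain, escaped = format_both("it's\there")
print(plain)    # contains a real tab and an unbalanced quote
print(escaped)  # the tab appears as the two characters \t
```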
Thanks for your feedback.
Just because many of them are wrong does not make them right!