Tokenizer inconsistency wrt new lines in comments

Fredrik Lundh fredrik at
Fri Apr 4 22:38:50 CEST 2008

George Sakkis wrote:

>> If it was a bug it has to violate a functional requirement. I can't
>> see which one.
> Perhaps it's not a functional requirement but it came up as a real
> problem on a source colorizer I use. I count on newlines generating
> token.NEWLINE or tokenize.NL tokens in order to produce <br> tags. It
> took me some time and head scratching to find out why some comments
> were joined together with the following line. Now I have to check
> whether a comment ends in a newline and, if it does, output an extra
> <br> tag. It works, but it's a kludge.
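
For readers who want to see the token stream being discussed, here is a minimal sketch using the standard tokenize module. It assumes a current Python 3 interpreter, not the 2008 release the thread is about, so the output may differ from what George saw: on current versions a comment on its own line should come back as a COMMENT token followed by a separate NL token, rather than one COMMENT token with the newline attached. The helper name token_names and the sample source are made up for illustration.

```python
import io
import tokenize


def token_names(source):
    """Return (token name, token text) pairs for a source string.

    Hypothetical helper for illustration; not part of any colorizer
    mentioned in this thread.
    """
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


if __name__ == "__main__":
    # One comment-only line and one line of code with a trailing comment.
    sample = "# a comment on its own line\nx = 1  # a trailing comment\n"
    for name, text in token_names(sample):
        print(f"{name:10} {text!r}")
```

A colorizer that emits a <br> tag for every NL or NEWLINE token can check this output first to see whether the interpreter it runs on needs the extra end-of-comment test described above.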

well, the real kludge here is of course that you're writing your own 
colorizer, when you can just go and grab Pygments:

or, if you prefer something tiny and self-contained, something like the 
colorizer module in this directory:

(the element_colorizer module in the same directory gives you XHTML in 
an ElementTree instead of raw HTML, if you want to postprocess things)


