[Python-Dev] IDLE colorizer
Guido van Rossum
guido at python.org
Mon Apr 2 15:01:49 EDT 2018
Heh. The good old manual approach. :-) How bad indeed?
>>> from idlelib import colorizer; colorizer.make_pat()
'\\b(?P<KEYWORD>False|None|True|and|as|assert|break|class|continue|def|del|elif|else|except|finally|for|from|global|if|import|in|is|lambda|nonlocal|not|or|pass|raise|return|try|while|with|yield)\\b|([^.\'\\"\\\\#]\\b|^)(?P<BUILTIN>ArithmeticError|AssertionError|AttributeError|BaseException|BlockingIOError|BrokenPipeError|BufferError|BytesWarning|ChildProcessError|ConnectionAbortedError|ConnectionError|ConnectionRefusedError|ConnectionResetError|DeprecationWarning|EOFError|Ellipsis|EnvironmentError|Exception|FileExistsError|FileNotFoundError|FloatingPointError|FutureWarning|GeneratorExit|IOError|ImportError|ImportWarning|IndentationError|IndexError|InterruptedError|IsADirectoryError|KeyError|KeyboardInterrupt|LookupError|MemoryError|ModuleNotFoundError|NameError|NotADirectoryError|NotImplemented|NotImplementedError|OSError|OverflowError|PendingDeprecationWarning|PermissionError|ProcessLookupError|RecursionError|ReferenceError|ResourceWarning|RuntimeError|RuntimeWarning|StopAsyncIteration|StopIteration|SyntaxError|SyntaxWarning|SystemError|SystemExit|TabError|TimeoutError|TypeError|UnboundLocalError|UnicodeDecodeError|UnicodeEncodeError|UnicodeError|UnicodeTranslateError|UnicodeWarning|UserWarning|ValueError|Warning|ZeroDivisionError|abs|all|any|ascii|bin|bool|bytearray|bytes|callable|chr|classmethod|compile|complex|copyright|credits|delattr|dict|dir|divmod|enumerate|eval|exec|exit|filter|float|format|frozenset|getattr|globals|hasattr|hash|help|hex|id|input|int|isinstance|issubclass|iter|len|license|list|locals|map|max|memoryview|min|next|object|oct|open|ord|pow|print|property|quit|range|repr|reversed|round|set|setattr|slice|sorted|staticmethod|str|sum|super|tuple|type|vars|zip)\\b|(?P<COMMENT>#[^\\n]*)|(?P<STRING>(?i:\\br|u|f|fr|rf|b|br|rb)?\'\'\'[^\'\\\\]*((\\\\.|\'(?!\'\'))[^\'\\\\]*)*(\'\'\')?|(?i:\\br|u|f|fr|rf|b|br|rb)?"""[^"\\\\]*((\\\\.|"(?!""))[^"\\\\]*)*(""")?|(?i:\\br|u|f|fr|rf|b|br|rb)?\'[^\'\\\\\\n]*(\\\\.[^\'\\\\\\n]*)*\'?|(?i:\\br|u|f|fr|rf|b|br|rb)?"[^
"\\\\\\n]*(\\\\.[^"\\\\\\n]*)*"?)|(?P<SYNC>\\n)'
>>>
On Mon, Apr 2, 2018 at 11:32 AM, MRAB <python at mrabarnett.plus.com> wrote:
> On 2018-04-02 05:43, Guido van Rossum wrote:
>
>> My question for you: how on earth did you find this?! Speaking of a
>> needle in a haystack. Did you run some kind of analysis program that looks
>> for regexprs? (We've received some good reports from someone who did that
>> looking for possible DoS attacks.)
>>
>> The thread was about string prefixes.
>
> Terry Reedy wrote "IDLE's colorizer does its parsing with a giant regex."
>
> I wondered: "How bad could it be?" (It's smaller now that the IGNORECASE
> flag can have a local scope.)
>
> It wasn't hard to find because it was in a file called "colorizer.py" in a
> folder called "idlelib".
>
>
> On Sun, Apr 1, 2018 at 6:49 PM, MRAB <python at mrabarnett.plus.com> wrote:
>>
>> A thread on python-ideas is talking about the prefixes of string
>> literals, and the regex used in IDLE.
>>
>> Line 25 of Lib\idlelib\colorizer.py is:
>>
>> stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?"
>>
>> which looks slightly wrong to me.
>>
>> The \b will apply only to the first choice.
>>
>> Shouldn't it be more like:
>>
>> stringprefix = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"
>>
>> ?
>>
>>
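
A minimal sketch of MRAB's point, using only the standard re module. The two prefix patterns are copied from the quoted message; the helper string_start(), the simplified double-quoted string tail, and the sample inputs are assumptions made for illustration and are not taken from colorizer.py.

```python
import re

# Prefix pattern as quoted from colorizer.py in the thread.
current = r"(?i:\br|u|f|fr|rf|b|br|rb)?"
# MRAB's suggested form: \b sits outside the alternation, so it
# guards every prefix, not just the first alternative.
proposed = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"


def string_start(prefix_pat, text):
    """Return the index where the first prefixed string match begins.

    Hypothetical helper: the string body here is a simplified
    double-quoted literal, not the full pattern IDLE uses.
    """
    m = re.search(prefix_pat + r'"[^"\n]*"', text)
    return m.start() if m else None


# 'r' is guarded by \b, so the trailing 'r' of an identifier is not
# taken as a prefix: the match starts at the quote (index 2).
print(string_start(current, 'xr"y"'))

# 'b' is not guarded, so the trailing 'b' of an identifier is
# swallowed into the match (index 1).
print(string_start(current, 'xb"y"'))

# With the proposed pattern the match starts at the quote again.
print(string_start(proposed, 'xb"y"'))

# A genuine prefix at a word boundary is still recognized.
print(string_start(proposed, 'b"y"'))
```

Under these assumptions, the only behavioral difference is for prefix letters other than a lone r that directly follow another word character.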
--
--Guido van Rossum (python.org/~guido)