[Python-ideas] Hooking between lexer and parser

Ryan Gonzalez rymg19 at gmail.com
Sat Jun 6 20:31:46 CEST 2015



On June 6, 2015 1:27:14 PM CDT, Neil Girdhar <mistersheik at gmail.com> wrote:
>Right.
>
>On Sat, Jun 6, 2015 at 1:52 PM, Ryan Gonzalez <rymg19 at gmail.com> wrote:
>
>>
>>
>> On June 6, 2015 12:29:21 AM CDT, Neil Girdhar <mistersheik at gmail.com>
>> wrote:
>> >On Sat, Jun 6, 2015 at 1:00 AM, Nick Coghlan <ncoghlan at gmail.com>
>> >wrote:
>> >
>> >> On 6 June 2015 at 12:21, Neil Girdhar <mistersheik at gmail.com> wrote:
>> >> > I'm curious what other people will contribute to this discussion as I
>> >> > think having no great parsing library is a huge hole in Python. Having
>> >> > one would definitely allow me to write better utilities using Python.
>> >>
>> >> The design of *Python's* grammar is deliberately restricted to being
>> >> parsable with an LL(1) parser. There are a great many static analysis
>> >> and syntax highlighting tools that are able to take advantage of that
>> >> simplicity because they only care about the syntax, not the full
>> >> semantics.
>> >>
>> >
>> >Given the validation that happens, it's not actually LL(1) though. It's
>> >mostly LL(1) with some syntax errors that are raised for various illegal
>> >constructs.
>> >
>> >Anyway, no one is suggesting changing the grammar.
>> >
>> >
>> >> Anyone actually doing their *own* parsing of something else *in*
>> >> Python would be better advised to reach for PLY
>> >> (https://pypi.python.org/pypi/ply). PLY is the parser underlying
>> >> https://pypi.python.org/pypi/pycparser, and hence the highly regarded
>> >> CFFI library, https://pypi.python.org/pypi/cffi.
>> >>
>> >> Other notable parsing alternatives folks may want to look at include
>> >> https://pypi.python.org/pypi/lrparsing and
>> >> http://pythonhosted.org/pyparsing/ (both of which allow you to use
>> >> Python code to define your grammar, rather than having to learn a
>> >> formal grammar notation).
>> >>
>> >>
>> >I looked at ply and pyparsing, but it was impossible to simply parse LaTeX
>> >because I couldn't tell the parser to suck up the right number of
>> >arguments given the name of the function.  When the parser sees a
>> >function, it learns how many arguments that function needs.  When it sees
>> >a function call \a{1}{2}{3}, if "\a" takes 2 arguments, then it should
>> >only suck up 1 and 2 as arguments, and leave 3 as a regular text token.
>> >In other words, I should be able to tell the parser what to expect in
>> >code that lives on the rule edges.
>>
>> Can't you just hack it into the lexer? When the backslash is detected, the
>> lexer can treat the following identifier as a function, look up the number
>> of required arguments, and push it onto some sort of stack. Whenever a left
>> brace is encountered and another argument is needed by the top of the
>> stack (TOS), it returns a special argument opener token.
>>
>
>Your solution is right, but I would implement it in the parser since I want
>that kind of generic functionality of dynamic grammar rules to be available
>everywhere.
>

Unless the parsing library doesn't support that, which is the case with PLY. I believe pycparser also uses the lexer to keep track of type names.
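
For what it's worth, here is a rough sketch of the lexer-side trick described in the quoted text above. It is a standalone generator, not PLY code, and everything in it (the `ARITY` table, the `lex` function, the token names) is made up for illustration. It also ignores real LaTeX rules such as whitespace skipping and unbraced arguments: anything other than an immediately following "{" simply ends argument collection.

```python
import re

# Hypothetical arity table: command name -> number of brace arguments.
ARITY = {"a": 2, "b": 1}

# A command (backslash + letters, or backslash + one other character),
# a single brace, or a run of plain text. A lone trailing backslash
# matches nothing and is skipped by this sketch.
TOKEN_RE = re.compile(r"\\([A-Za-z]+|.)|([{}])|([^\\{}]+)", re.DOTALL)


def lex(text, arity=ARITY):
    """Yield (kind, value) pairs.

    Braces that delimit a command argument come out as ARG_OPEN / ARG_CLOSE;
    all other braces come out as LBRACE / RBRACE.
    """
    depth = 0     # current brace nesting depth
    pending = []  # stack of [args_still_needed, depth_of_command]
    closers = []  # token kind to emit for each currently open brace
    for match in TOKEN_RE.finditer(text):
        command, brace, chars = match.groups()
        if brace == "{":
            if pending and pending[-1][1] == depth:
                # The command on top of the stack still wants an argument.
                pending[-1][0] -= 1
                if pending[-1][0] == 0:
                    pending.pop()
                closers.append("ARG_CLOSE")
                yield ("ARG_OPEN", brace)
            else:
                closers.append("RBRACE")
                yield ("LBRACE", brace)
            depth += 1
            continue
        # Anything other than "{" ends argument collection at this depth.
        while pending and pending[-1][1] >= depth:
            pending.pop()
        if brace == "}":
            if not closers:
                raise ValueError("unbalanced '}'")
            depth -= 1
            yield (closers.pop(), brace)
        elif command is not None:
            yield ("COMMAND", command)
            if arity.get(command, 0) > 0:
                pending.append([arity[command], depth])
        else:
            yield ("TEXT", chars)


print([kind for kind, _ in lex(r"\a{1}{2}{3}")])
```

With "\a" declared as taking 2 arguments, the first two brace groups come out as ARG_OPEN ... ARG_CLOSE and the third as an ordinary LBRACE, TEXT, RBRACE run, so a fixed grammar in the parser never has to know the arity. The depth recorded with each stack entry keeps a nested group inside one argument from being mistaken for the next argument of the outer command.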

>
>>
>> >
>> >The parsing tools you listed work really well until you need to do
>> >something like (1) the validation step that happens in Python, or (2)
>> >figuring out exactly where the syntax error is (line and column number)
>> >or (3) ensuring that whitespace separates some tokens even when it's not
>> >required to disambiguate different parse trees.  I got the impression
>> >that they wanted to make these languages simple for the simple cases, but
>> >they were made too simple and don't allow you to do everything in one
>> >simple pass.
>> >
>> >Best,
>> >
>> >Neil
>> >
>> >
>> >> Regards,
>> >> Nick.
>> >>
>> >> --
>> >> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>> >>
>> >
>> >
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

