<div dir="ltr">Right.<div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jun 6, 2015 at 1:52 PM, Ryan Gonzalez <span dir="ltr"><<a href="mailto:rymg19@gmail.com" target="_blank">rymg19@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>

<br>

On June 6, 2015 12:29:21 AM CDT, Neil Girdhar <<a href="mailto:mistersheik@gmail.com">mistersheik@gmail.com</a>> wrote:<br>

>On Sat, Jun 6, 2015 at 1:00 AM, Nick Coghlan <<a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>> wrote:<br>

><br>

>> On 6 June 2015 at 12:21, Neil Girdhar <<a href="mailto:mistersheik@gmail.com">mistersheik@gmail.com</a>> wrote:<br>

>> > I'm curious what other people will contribute to this discussion as I<br>

>> > think having no great parsing library is a huge hole in Python.  Having<br>

>> > one would definitely allow me to write better utilities using Python.<br>

>><br>

>> The design of *Python's* grammar is deliberately restricted to being<br>

>> parsable with an LL(1) parser. There are a great many static analysis<br>

>> and syntax highlighting tools that are able to take advantage of that<br>

>> simplicity because they only care about the syntax, not the full<br>

>> semantics.<br>

>><br>

><br>

>Given the validation that happens, it's not actually LL(1) though. It's<br>

>mostly LL(1) with some syntax errors that are raised for various illegal<br>

>constructs.<br>

><br>

>Anyway, no one is suggesting changing the grammar.<br>

><br>

><br>

>> Anyone actually doing their *own* parsing of something else *in*<br>

>> Python, would be better advised to reach for PLY<br>

>> (<a href="https://pypi.python.org/pypi/ply" target="_blank">https://pypi.python.org/pypi/ply</a> ). PLY is the parser underlying<br>

>> <a href="https://pypi.python.org/pypi/pycparser" target="_blank">https://pypi.python.org/pypi/pycparser</a>, and hence the highly regarded<br>

>> CFFI library, <a href="https://pypi.python.org/pypi/cffi" target="_blank">https://pypi.python.org/pypi/cffi</a><br>

>><br>

>> Other notable parsing alternatives folks may want to look at include<br>

>> <a href="https://pypi.python.org/pypi/lrparsing" target="_blank">https://pypi.python.org/pypi/lrparsing</a> and<br>

>> <a href="http://pythonhosted.org/pyparsing/" target="_blank">http://pythonhosted.org/pyparsing/</a> (both of which allow you to use<br>

>> Python code to define your grammar, rather than having to learn a<br>

>> formal grammar notation).<br>

>><br>

>><br>

>I looked at ply and pyparsing, but it was impossible to simply parse LaTeX<br>

>because I couldn't tell the parser to consume the right number of arguments<br>

>given the name of the function.  When the parser sees a function, it learns<br>

>how many arguments that function needs.  When it sees a function call<br>

>\a{1}{2}{3}, if "\a" takes 2 arguments, then it should only consume 1 and 2<br>

>as arguments, and leave 3 as a regular text token. In other words, I should<br>

>be able to tell the parser what to expect in code that lives on the rule<br>

>edges.<br>

<br>

</div></div>Can't you just hack it into the lexer? When the backslash is detected, the lexer can treat the following identifier as a function, look up the number of required arguments, and push that count onto some sort of stack. Whenever a left brace is encountered and the entry on top of the stack (TOS) still needs another argument, the lexer returns a special argument-opener token.<br></blockquote><div><br></div><div>Your solution is right, but I would implement it in the parser, since I want dynamic grammar rules to be available everywhere as generic functionality.</div><div>   </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class=""><br>

><br>

>The parsing tools you listed work really well until you need to do<br>

>something like (1) the validation step that happens in Python, or (2)<br>

>figuring out exactly where the syntax error is (line and column number) or<br>

>(3) ensuring that whitespace separates some tokens even when it's not<br>

>required to disambiguate different parse trees.  I got the impression that<br>

>they wanted to make these languages simple for the simple cases, but they<br>

>were made too simple and don't allow you to do everything in one simple<br>

>pass.<br>

><br>

>Best,<br>

><br>

>Neil<br>

><br>

><br>

>> Regards,<br>

>> Nick.<br>

>><br>

>> --<br>

>> Nick Coghlan   |   <a href="mailto:ncoghlan@gmail.com">ncoghlan@gmail.com</a>   |   Brisbane, Australia<br>

>><br>

><br>

><br>

</span><div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br></div></div>
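<div>A minimal sketch of the arity-driven argument handling discussed in this thread, done in the parser rather than the lexer. This is a hand-written recursive-descent parser, not PLY or pyparsing; the <code>ARITY</code> table, the token shapes, and the function names are all hypothetical choices made for illustration. Escapes such as <code>\{</code> and most real LaTeX details are ignored.</div>
<pre>
```python
import re

# Hypothetical arity table. A real LaTeX parser would fill this in as it
# encounters command definitions; here it is hard-coded for illustration.
ARITY = {"a": 2, "frac": 2, "sqrt": 1}

# One alternative per token kind: \name, a single brace, or a run of plain text.
TOKEN_RE = re.compile(r"\\([A-Za-z]+)|([{}])|([^\\{}]+)")


def tokenize(text):
    """Split text into ('cmd', name), ('brace', ch) and ('text', s) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        name, brace, plain = match.groups()
        if name is not None:
            tokens.append(("cmd", name))
        elif brace is not None:
            tokens.append(("brace", brace))
        else:
            tokens.append(("text", plain))
    return tokens


def parse_group(tokens, pos):
    """Parse a brace-delimited group that opens at pos; return (nodes, next_pos)."""
    inner, pos = parse(tokens, pos + 1, stop_at_brace=True)
    return inner, pos + 1  # step over the closing brace


def parse(tokens, pos=0, stop_at_brace=False):
    """Parse tokens starting at pos; return (nodes, next_pos)."""
    nodes = []
    while pos != len(tokens):
        kind, value = tokens[pos]
        if kind == "brace" and value == "}":
            if stop_at_brace:
                return nodes, pos  # the caller consumes the closing brace
            raise SyntaxError("unmatched '}' at token %d" % pos)
        if kind == "cmd":
            pos += 1
            args = []
            # The arity lookup decides how many groups belong to this command.
            for _ in range(ARITY.get(value, 0)):
                if pos == len(tokens) or tokens[pos] != ("brace", "{"):
                    raise SyntaxError(
                        "\\%s expects %d argument(s)" % (value, ARITY[value])
                    )
                arg, pos = parse_group(tokens, pos)
                args.append(arg)
            nodes.append(("call", value, args))
        elif kind == "brace":
            # An opening brace not claimed by any command: an ordinary group.
            group, pos = parse_group(tokens, pos)
            nodes.append(("group", group))
        else:
            nodes.append(("text", value))
            pos += 1
    if stop_at_brace:
        raise SyntaxError("missing '}'")
    return nodes, pos


tree, _ = parse(tokenize(r"\a{1}{2}{3}"))
print(tree)
```
</pre>
<div>With <code>\a</code> registered as taking 2 arguments, the call node for <code>\a{1}{2}{3}</code> claims only the first two groups, and <code>{3}</code> comes out as an ordinary group outside the call. The decision is made at the point where the command is recognized, which is roughly what "code that lives on the rule edges" asks for. A grammar-driven tool would need an equivalent hook.</div>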