[Python-ideas] shlex escapes without Posix mode

Ryan rymg19 at gmail.com
Sat Jul 27 17:05:12 CEST 2013


Sorry, I'm a little rusty at explaining things.

I was using ksh as an example. What I'm really trying to parse is a simple whitespace-delimited language for writing C++ documentation. The basic problem is that Posix mode didn't work right for it, but I really need those escapes. Normal mode set to split only at whitespace works great, but having the user write " for every quote is a little annoying. And it's a simple language; PLY and PyParsing are overkill.
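To make the trade-off concrete, here is a small check of how the two modes of the stock shlex.split treat quotes and backslashes. The sample line is made up for illustration; it is not from my actual documentation language.

```python
import shlex

# Hypothetical input line, invented for this example.
line = r'brief "a class" two\ words'

# POSIX mode: quotes are stripped and the backslash escapes the space.
posix_tokens = shlex.split(line, posix=True)

# Non-POSIX mode: quotes stay in the token, but the backslash is an
# ordinary character, so the "escaped" space still splits the word.
legacy_tokens = shlex.split(line, posix=False)

print(posix_tokens)   # ['brief', 'a class', 'two words']
print(legacy_tokens)  # ['brief', '"a class"', 'two\\', 'words']
```

Neither result is quite what I'm after: the first loses the quotes, the second has no escapes.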

Also, I used split in the example, but I'm really using the shlex class itself.
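For reference, a minimal sketch of driving a shlex instance directly and comparing both modes on the foo("bar") input suggested in the quoted reply below. The helper name tokens is my own; everything else is the standard shlex API.

```python
import shlex

def tokens(text, posix):
    """Return every token shlex produces for text in the given mode."""
    lexer = shlex.shlex(text, posix=posix)
    # Iterating the lexer stops at its own EOF marker, which is ''
    # in non-POSIX mode and None in POSIX mode.
    return list(lexer)

print(tokens('foo("bar")', posix=False))  # ['foo', '(', '"bar"', ')']
print(tokens('foo("bar")', posix=True))   # ['foo', '(', 'bar', ')']
```

The parentheses come out as separate tokens either way; only the treatment of the quotes differs.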

Back on the escapes, I still think shlex needs them. If you give me some time, I can write a modified shlex with support for that, but it might take a bit. I also have a more refined concept. Instead of being None, the escape parameter I used in my last example could be an empty string by default. Posix mode has priority over the parameter; if Posix mode is enabled, the parameter will be set to '\\'. If not, the parameter is left alone. That way, shlex wouldn't have to check for None. The feature is off if the string is empty.
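Until I have a modified shlex, here is a rough standalone sketch of the behaviour I have in mind. It is not shlex and not a patch to it: the function split_with_escapes and its escape parameter are hypothetical names, and it only covers whitespace splitting, quotes that are kept in the token, and a single escape character that is disabled when the string is empty.

```python
def split_with_escapes(text, escape="\\"):
    """Split text on whitespace, keeping quote characters in the tokens.

    Hypothetical helper, not part of shlex. If escape is a non-empty
    character, the character following it is taken literally (inside
    or outside quotes). If escape is '', the feature is off.
    """
    tokens = []
    current = []
    quote = None       # the quote character we are inside, or None
    in_token = False   # True once the current token has any content
    chars = iter(text)
    for ch in chars:
        if escape and ch == escape:
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("No escaped character")
            current.append(nxt)
            in_token = True
        elif quote:
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
            current.append(ch)
            in_token = True
        elif ch.isspace():
            if in_token:
                tokens.append("".join(current))
                current = []
                in_token = False
        else:
            current.append(ch)
            in_token = True
    if quote:
        raise ValueError("No closing quotation")
    if in_token:
        tokens.append("".join(current))
    return tokens

print(split_with_escapes(r'say "hi there" a\ b'))
# ['say', '"hi there"', 'a b']
print(split_with_escapes(r'a\ b', escape=''))
# ['a\\', 'b']
```

With the empty-string default there is nothing to compare against None; a plain truth test on the parameter is enough to switch the feature off.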

Andrew Barnert <abarnert at yahoo.com> wrote:

>On Jul 27, 2013, at 0:11, Ryan <rymg19 at gmail.com> wrote:
>
>> ksh has parentheses and shell functions. Still a shell language.
>
>And you have to use posix mode with it. Otherwise it'll get quotes
>within words, empty strings, etc. wrong.
>
>Also, posix mode will handle the parentheses the same way as legacy
>mode. The parsing rules don't have any differences with parens. Try
>shlex('foo("bar")') with both modes, and call get_token repeatedly and
>see. So parens aren't relevant, and you aren't trying to do anything
>with ksh you wouldn't do the same way with sh, unless I'm
>misunderstanding you.
>
>It sounds like what you want is the legacy internal-quote-stripping
>mode with posix everything else? But you don't even really want that;
>if you're parsing ksh, you need to handle internal quotes the same way
>that sh does; you just don't want to consider parenthesized arguments
>"internal". And turning off posix mode doesn't do that--it seems to do
>the right thing in trivial cases, but not in general.
>
>More importantly, it sounds like you want to parse parens, which means
>you really need to use a shlex instance manually rather than calling
>split. For example:
>
>>>> s=shlex.shlex('foo("spam eggs" bar)', posix=True)
>>>> list(iter(s.get_token, None))
>['foo', '(', 'spam eggs', 'bar', ')']
>
>Those are the tokens you want, right? There's no way to get that with
>split.
>
>> And shell languages pretty much always have escapes.
>> 
>> Andrew Barnert <abarnert at yahoo.com> wrote:
>>> 
>>> Are you trying to use shlex to parse code for some language other
>than sh or another shell language? It's not meant to be useful for perl
>or C or whatever.
>>> 
>>> A general-purpose quoting, escaping, splitting, and joining module
>that could be configured to handle everything from sh to C to CSV could
>be cool, but shlex isn't it.
>>> 
>>> On Jul 26, 2013, at 18:43, Ryan <rymg19 at gmail.com> wrote:
>>> 
>>>> The main thing is that this:
>>>> 
>>>> ("d")
>>>> 
>>>> In Posix mode gets split into this:
>>>> 
>>>> (d)
>>>> 
>>>> But, say the language has callable functions. I'd have to
>re-shlex.split the line to split the arguments. And, even then, the
>quotes already got destroyed.
>>>> 
>>>> Escapes, however, are useful in practically every language.
>Restricting them to POSIX mode just kills it. And I had tried to see if
>I could implement it myself, but reading source code on Android SL4A is
>absolutely painful. And, whenever I pull up a computer, I always have a
>goal in mind and haven't got a chance to tweak it.
>>>> 
>>>> I've never quite come across a language without some form of
>escapes. And, I can't think of an occasion where I'd use POSIX mode.
>Therefore, in the end, it would end up being better if you could enable
>the escapes individually. POSIX mode would have priority over the
>escape option. The instance could be created like this:
>>>> 
>>>> lex = shlex.shlex(escape='\\')
>>>> 
>>>> The default value would be None. That would change the value of
>lex.escape to '\\'. If the value is None, escapes are disabled.
>>>> 
>>>> Steven D'Aprano <steve at pearwood.info> wrote:
>>>>> 
>>>>> Hi Ryan, and welcome.
>>>>> 
>>>>> 
>>>>> On 26/07/13 05:22, Ryan wrote:
>>>>>> Note: This is my first post to the mailing list, so I'm not sure
>if I'm doing something wrong or something.
>>>>>> 
>>>>>> I've been playing around with shlex lately, and I mostly like it,
>but I have an idea.
>>>>>> 
>>>>>> Have an option with the ability to enable certain Posix mode
>features selectively, most particularly character escapes. It could be
>something like, if Posix mode is disabled, the string of escape
>characters is set to empty or None, and assigning a value to it enables
>that feature in non-Posix mode.
>>>>> 
>>>>> 
>>>>> That's a good start, but it's awfully vague. "Something like"?
>Concrete ideas will help. Actual working code is best (although be
>cautious about posting large amounts of code here -- a few lines is
>fine, pages of code, not so much), or at least pseudo-code
>demonstrating how and when somebody might use this proposed feature.
>>>>> 
>>>>> Good use-cases for why you might want the feature also helps.
>Under what circumstances would you say "Well, I don't want POSIX mode,
>but I do want POSIX escape sequences"?
>>>>> 
>>>>> Ultimately, don't be surprised or disappointed at negative
>reactions. Negative reactions are better than silence -- at least it
>means that people have read, and care enough to comment, on your post,
>while silence may mean that nobody cares, or simply doesn't understand
>what you're talking about and are too polite to say so.
>>>>> 
>>>>> We tend to be rather conservative about adding new features.
>Sometimes it takes *years* for features to be added, or they are never
>added, if nobody who cares about the feature steps up to program it.
>Remember too that new code has to carry its weight: code not only has
>one-off costs (code doesn't write itself, neither does the
>documentation), but also on-going costs (maintenance, bug-fixes, new
>features for users to learn, etc.), and no matter how low that cost
>is, it is never zero, so if the benefit from that feature is not more
>than the cost, it will probably be rejected.
>>>>> 
>>>>> Two good posts you should read, by one of the senior core
>developers, are:
>>>>> 
>>>>>
>http://www.boredomandlaziness.org/2011/04/musings-on-culture-of-python-dev.html
>>>>> 
>>>>>
>http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html
>>>>> 
>>>>> 
>>>>> If you take nothing else from my reply, at least take from it
>these two questions:
>>>>> 
>>>>> "Under what circumstances would this feature be useful to you? And
>would they be useful enough that you personally would program this
>feature, if you had the skills?"
>>>> 
>>>> -- 
>>>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>> _______________________________________________
>>>> Python-ideas mailing list
>>>> Python-ideas at python.org
>>>> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.