I think we're down to quibbling over the meaning of "virtually" here. I recognize that not all Windows paths are grammatical regexen, and you recognize that most are.

So is it 80%, 95%, 99.9%? And do we mean "paths found in the wild" or "paths as systematically enumerated" from the possibility space?

On Oct 26, 2015 3:23 PM, "Andrew Barnert" <abarnert@yahoo.com> wrote:
On Oct 26, 2015, at 14:53, David Mertz <mertz@gnosis.cx> wrote:

> Obviously there can't be a regex to exclude everything that isn't a regex. Parentheses can nest to unlimited depths, so you need a formal grammar.

As I said:
> Obviously there's no actual regular expression that matches all regular expressions (you can't handle matched brackets without recursion or some other extension)

But you can do it trivially with Perl, or with the regex module for Python, e.g., just by sticking a "(?1)" inside a pair of escaped parens plus a negative lookahead or nongreedy repetition. I'm not sure exactly how powerful Python's re module regexes are (if I want to match something that isn't a regular language, I tend to reach for or build a dedicated parser rather than try to stretch re), but I know they're somewhere between actual regular expressions and Perl regexes.
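
A rough sketch of the "(?1)" idea, assuming the third-party regex module is installed. It only checks that one parenthesized group is balanced, which is far short of validating a whole regex; the pattern and the helper name is_balanced_group are made up for illustration.

```python
try:
    import regex  # third-party package (pip install regex), not the stdlib re
except ImportError:
    regex = None

# Group 1 is: an escaped "(", then any mix of non-paren characters or a
# recursive match of group 1 itself via (?1), then an escaped ")".
BALANCED = r"(\((?:[^()]|(?1))*\))"


def is_balanced_group(text):
    """Return True if text is exactly one balanced parenthesized group."""
    if regex is None:
        raise RuntimeError("the third-party 'regex' module is not installed")
    return regex.fullmatch(BALANCED, text) is not None


if __name__ == "__main__" and regex is not None:
    print(is_balanced_group("(a(b)c)"))  # nested and closed
    print(is_balanced_group("(a(b)c"))   # outer paren never closed
```

The stdlib re module has no recursion construct, so the same pattern would be rejected by re.compile.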

> But virtually everything that is a Windows path is also a formally grammatical regex (as are many things with no plausible likely intention as such)

That's not true. You can, for example, have unclosed brackets or parens in a Windows path. And if you're wondering why anyone would do that, consider MP3 files auto-named based on their ID3v1/FreeDB metadata, which truncates fields at 29 or 30 bytes.
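
A small sketch of that case, using only the stdlib. The song title, the directory, and the helper is_valid_regex are hypothetical; the 30-character cut stands in for the ID3v1 field limit. The directory name is chosen so that its backslash sequences (\d, \S) happen to be valid regex escapes, which isolates the unclosed paren as the problem.

```python
import re


def is_valid_regex(candidate):
    """Return True if the stdlib re module can compile candidate."""
    try:
        re.compile(candidate)
    except re.error:
        return False
    return True


# Hypothetical title, cut to 30 characters the way an ID3v1 field would be.
title = "Song Title (Live at the Apollo Theater)"
truncated = title[:30]  # the closing paren is lost in the cut

full_path = "D:\\data\\" + title + ".mp3"
truncated_path = "D:\\data\\" + truncated + ".mp3"

if __name__ == "__main__":
    print(truncated)
    print(is_valid_regex(full_path))       # parens are balanced
    print(is_valid_regex(truncated_path))  # "(" is never closed
```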

Anyway, as I said in the same message, it wouldn't be a useful heuristic because there's so much overlap, but you don't need to exaggerate that to make the same point.

On Oct 26, 2015 2:44 PM, "Ben Finney" <ben+python@benfinney.id.au> wrote:
Andrew Barnert via Python-ideas
<python-ideas@python.org> writes:

> Just for fun: is there a Python regex that matches all valid Python
> regexes?

Yes: ‘.*’ matches all valid Python regexes.

> Obviously there's no actual regular expression that matches all
> regular expressions (you can't handle matched brackets without
> recursion or some other extension).

You seem to be seeking something else: a pattern that matches all valid
regex patterns, *and* will never match any string that is not a valid
regex pattern. The latter is rather more difficult.
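
To illustrate the distinction (a sketch only; the helper names are made up here): ‘.*’ accepts every candidate, valid or not, while asking re.compile is one way to get the second property.

```python
import re

CATCH_ALL = re.compile(".*")


def catch_all_accepts(candidate):
    """'.*' can always match, even if only the empty prefix."""
    return CATCH_ALL.match(candidate) is not None


def compiles(candidate):
    """True only if candidate is itself a valid pattern for re."""
    try:
        re.compile(candidate)
    except re.error:
        return False
    return True


if __name__ == "__main__":
    for candidate in ["a(b)c", "a(b"]:
        print(candidate, catch_all_accepts(candidate), compiles(candidate))
```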

--
 \      “When I was born I was so surprised I couldn't talk for a year |
  `\                                        and a half.” —Gracie Allen |
_o__)                                                                  |
Ben Finney

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/