[Python-ideas] Exposing regular expression bytecode

Jonathan Goble jcgoble3 at gmail.com
Tue Feb 16 13:29:33 EST 2016


OK, thanks for the comments, everyone. I'm glad to hear that people
generally think this is a useful idea.

Some specific replies:

On Tue, Feb 16, 2016 at 4:22 AM, Chris Angelico <rosuav at gmail.com> wrote:
> For what it's worth, I read your post with interest, but didn't have
> anything substantive to reply - mainly because I don't use regexes
> much. But it would be rather cool to be able to decompile a regex.
> Imagine a regex pretty-printer: compile an arbitrary string, and if
> it's a valid regex, decompile it to a valid source code form, using
> re.VERBOSE. That could help _hugely_ with debugging, if the trick can
> be pulled off.
>
> ChrisA

That's exactly the type of tools I envision being made available by
third parties. Depending on how much I get invested into this project,
I may even write such a tool myself (though that's not guaranteed).

On Tue, Feb 16, 2016 at 4:55 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> Sorry. I don't personally have any issue with the proposal, and it
> sounds like a reasonable idea. I don't think it's likely to be
> *hugely* controversial - although it will likely need a little care in
> documenting the feature to ensure that we are clear that there's no
> guarantees of backward compatibility that we don't want to commit to
> on the newly - exposed data. And we should also ensure that by
> exposing this information, we don't preclude changes such as the
> incorporation of the regex module (I don't know if the regex module
> has a bytecode implementation like the re module does).

The regex implementation is indeed something I would need to
investigate here, and will do so before I go too far.

> The next step is probably simply to raise a tracker issue for this. I
> know you said you have little C experience, but the big problem is
> that it's unlikely that any of the core devs with C experience will
> have the time or motivation to code up your idea. So without a working
> patch, and someone willing and able to respond to comments on the
> patch, it's not likely to progress.

Tracker issue is already filed: http://bugs.python.org/issue26336 I
actually filed the issue before I realized that the mailing lists were
a better place to discuss it.

> But if you are willing to dig into Python's C API yourself (and it
> sounds like you are) there are definitely people who will help you.
> You might want to join the core mentorship list (see
> http://pythonmentors.com/) where you should get plenty of assistance.
> This proposal sounds like a great "beginner" task, as well - so even
> if you don't want to implement it yourself, still put it on the
> tracker, and mark it as an "easy" change, and maybe some other
> newcomer who wants a task to help them learn the C API will pick it
> up.

I'll look into the mentorship list; thanks for the link. As for
marking it "easy", I don't seem to have the necessary permissions to
change the Keywords field; perhaps you or someone else can set that
flag for me? If so, I'd appreciate it. :-)

> Hope that helps - thanks for the suggestion and sorry if it seems like
> no-one was interested at first. It's an unfortunate fact of life
> around here that things *do* take time to get people's interest. You
> mention patience in one of your messages - that's definitely something
> you'll need to cultivate, I'm afraid... :-)

Patience is something I've been working on since I was a little kid.
I'm 29 years old now, and it still eludes me from time to time. But
yes, it's something I'll have to work on. :-P

Also, I received a small patch off-list from Petr Viktorin
implementing a getter for the code list (thanks, Petr). I'll need to
test it, but from the little I know of the C API it looks like it will
get me started in the right direction. Assuming that works, what's
left is a public constructor for the regex type (to enable
optimizers), a dis-like module, and docs and tests. I don't think this
would be major enough to require a PEP, but of course being new here,
I'm open to being told I'm wrong. :-)


More information about the Python-ideas mailing list