[IronPython] differences in IronPython/CPython regular expressions?

George Silva georger.silva at gmail.com
Thu Jun 2 04:28:04 CEST 2011


If you're on Windows, you can test the native C# behavior with a tool
called Rad Software Regular Expression Designer. It's very helpful.
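
To have something small to paste into such a tool, here is a minimal sketch
of how a pattern following Bill's rule (names either capitalized or
upper-case, so no re.IGNORECASE) might be built and timed. The NAMES list
and the helpers build_name_pattern and time_search are hypothetical
stand-ins; the real 223613-character expression is not shown in this thread
and may be structured quite differently.

```python
import re
import time

# Hypothetical stand-in for the real name list, which is not shown in the thread.
NAMES = ["bill", "jeff", "george"]

def build_name_pattern(names):
    """Build an alternation matching each name capitalized or fully upper-case."""
    alternatives = []
    for name in names:
        alternatives.append(re.escape(name.capitalize()))
        alternatives.append(re.escape(name.upper()))
    # Longest first, so a longer name is tried before one that is its prefix.
    alternatives.sort(key=len, reverse=True)
    return r"\b(?:" + "|".join(alternatives) + r")\b"

def time_search(pattern, text):
    """Return (elapsed seconds, matches) for one findall over text."""
    compiled = re.compile(pattern)
    start = time.time()
    matches = compiled.findall(text)
    return time.time() - start, matches

pattern = build_name_pattern(NAMES)
elapsed, found = time_search(pattern, "BILL wrote to Jeff, not to george.")
print(len(pattern), elapsed, found)
```

Running the same script under CPython and IronPython, and pasting the value
of pattern into the designer, should help show whether the slowdown comes
from the expression itself or from the regex engine underneath.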

On Wed, Jun 1, 2011 at 8:44 PM, Bill Janssen <janssen at parc.com> wrote:

> Jeff Hardy <jdhardy at gmail.com> wrote:
>
> > On Wed, Jun 1, 2011 at 4:03 PM, Bill Janssen <janssen at parc.com> wrote:
> > > I have a large RE (223613 chars) that works fine in CPython 2.6, but
> >
> > That's truly horrible, but I assume you have a good reason for it.
>
> Hi, Jeff.  Yes, I think so.
>
> > > seems to produce an endless loop in IronPython (see below).  I'm using
> > > Mono 2.10 (.NET 4.0.x) on Ubuntu, with IronPython 2.7.  Anyone have
> > > pointers to the differences between them?  Is
> > > System::Text::RegularExpressions in .NET configurable in some fashion
> > > that might help?
> >
> > First off, is there a reason you don't use re.IGNORECASE? That would
> > cut the regex in half, at least.
>
> Sure.  Names sensitive to capitalization; the rule I'm implementing says
> names are either capitalized or upper-case.
>
> > For the most part, CPython and IronPython regexes should be fairly
> > compatible - IronPython takes the regex and massages it to work with
> > System.Text.RE, but the changes are pretty straightforward and small,
>
> Are those changes documented anywhere?
>
> > and I don't think the re you provided hits any of them. It's quite
> > possible that the Mono version of System.Text.RE can't handle the
> > expression; you could test this by saving the full regex and building a
> > small C# program that runs it. The regex template has a lot of
> > potential backtracking in it; are you sure it's not caught in a
> > pathological (exponential) case?
>
> No; all I'm sure of is that this runs in 1.2 seconds in CPython, and
> takes up a core for 15 minutes (till I kill it) with IronPython/Mono.
> Something is clearly hitting a bug somewhere...  I suppose I should
> try it on Windows.
>
> > Finally, is one ginormous regex really the best way to do this? Have you
> > tried other approaches?
>
> No need, until I hit .NET.  I'm used to working with a full-featured
> finite-state machine (PARC's xfst; see
> http://www.cis.upenn.edu/~cis639/docs/xfst.html), and was wondering if
> we could do similar things with Python's RE machinery.  Long lists like
> these names are often used for lists of companies or cities or such.
> People's names are actually a fairly simple and short example of this :-).
>
> Bill
>
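
On Jeff's question about a pathological (exponential) case: one rough way
to check is to time a failing match on inputs of increasing length and see
whether the time roughly doubles with each extra character. The sketch
below uses the classic nested-quantifier pattern (a+)+$ purely as an
illustration; it is not taken from Bill's expression, and the probe helper
is a hypothetical name.

```python
import re
import time

# Classic nested-quantifier pattern, used only as an illustration;
# it is not taken from the regex discussed in this thread.
PATHOLOGICAL = re.compile(r"(a+)+$")

def probe(compiled, sizes):
    """Time a failing match on inputs of increasing length."""
    timings = []
    for n in sizes:
        text = "a" * n + "b"  # the trailing 'b' forces the match to fail
        start = time.time()
        result = compiled.match(text)
        timings.append((n, time.time() - start, result is not None))
    return timings

# Sizes are kept small on purpose so the probe finishes quickly.
for n, seconds, matched in probe(PATHOLOGICAL, range(10, 19, 2)):
    print(n, round(seconds, 4), matched)
```

If the real expression shows the same kind of growth on one engine but not
the other, that would point at a difference in how the two engines
backtrack rather than at the IronPython translation layer.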



-- 
George R. C. Silva

Desenvolvimento em GIS
http://geoprocessamento.net
http://blog.geoprocessamento.net