parsing an Excel formula with the re module

Steve Holden steve at
Wed Jan 6 03:06:01 CET 2010

Tim Chase wrote:
> vsoler wrote:
>> Hence, I need to parse Excel formulas. Can I do it by means only of re
>> (regular expressions)?
>> I know that for simple formulas such as "=3*A7+5" it is indeed
>> possible. What about complex formulas that include functions,
>> sheet names and possibly other *.xls files?
> Where things start getting ugly is when you have nested function calls,
> such as
>   =if(Sum(A1:A25)>42,Min(B1:B25), if(Sum(C1:C25)>3.14, (Min(C1:C25)+3)*18,Max(B1:B25)))
> Regular expressions don't do well with nested parens (especially
> arbitrarily-nesting-depth such as are possible), so I'd suggest going
> for a full-blown parsing solution like pyparsing.
> If you have fair control over what can be contained in the formulas and
> you know they won't contain nested parens/functions, you might be able
> to formulate some sort of "kinda, sorta, maybe parses some forms of
> formulas" regexp.
And don't forget about named ranges, which can reference cells without
using anything but a plain identifier ...
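
To make both points concrete, here is a rough sketch of how far re alone
gets you. It is only an illustration: TOKEN_RE, tokenize and
max_paren_depth are names made up for this example, and the patterns
cover just a small, assumed subset of Excel syntax (no strings, sheet
names or external workbook references). A regexp can split a formula
into tokens, but matching the nested parens needs a counter or a real
parser on top.

```python
import re

# Assumed, simplified token patterns; real Excel syntax is much richer.
TOKEN_RE = re.compile(r"""
    (?P<number>\d+(?:\.\d+)?)                      # 3, 42, 3.14
  | (?P<cell>\$?[A-Za-z]{1,3}\$?\d+(?![A-Za-z0-9_]))  # A7, $B$25
  | (?P<name>[A-Za-z_][A-Za-z0-9_.]*)              # function OR named range
  | (?P<op>[-+*/^&=<>():,])                        # operators and punctuation
  | (?P<ws>\s+)                                    # whitespace, skipped below
""", re.VERBOSE)


def tokenize(formula):
    """Split a formula into (kind, text) pairs, dropping whitespace."""
    text = formula[1:] if formula.startswith("=") else formula
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character %r at %d" % (text[pos], pos))
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def max_paren_depth(tokens):
    """Return the deepest paren nesting; the regexp itself cannot track this."""
    depth = deepest = 0
    for kind, text in tokens:
        if kind == "op" and text == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif kind == "op" and text == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
    if depth != 0:
        raise ValueError("unbalanced '('")
    return deepest


if __name__ == "__main__":
    print(tokenize("=3*A7+5"))
    # "Rate" comes back as a plain name token; only the workbook can say
    # which cells (if any) a named range like that refers to.
    print(tokenize("=Rate*A7"))
```

Note that a named range and a function name both come out as "name"
tokens here. A name followed by "(" is presumably a function call, but a
bare name cannot be resolved to cells without the workbook's list of
defined names.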

Steve Holden           +1 571 484 6266   +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010
Holden Web LLC       
