parsing an Excel formula with the re module
Steve Holden
steve at holdenweb.com
Tue Jan 5 21:06:01 EST 2010
Tim Chase wrote:
> vsoler wrote:
>> Hence, I need to parse Excel formulas. Can I do it by means only of re
>> (regular expressions)?
>>
>> I know that for simple formulas such as "=3*A7+5" it is indeed
>> possible. What about complex formulas that include functions,
>> sheet names and possibly other *.xls files?
>
> Where things start getting ugly is when you have nested function calls,
> such as
>
> =if(Sum(A1:A25)>42,Min(B1:B25), if(Sum(C1:C25)>3.14,
> (Min(C1:C25)+3)*18,Max(B1:B25)))
>
> Regular expressions don't do well with nested parens (especially
> arbitrarily-nesting-depth such as are possible), so I'd suggest going
> for a full-blown parsing solution like pyparsing.
>
> If you have fair control over what can be contained in the formulas and
> you know they won't contain nested parens/functions, you might be able
> to formulate some sort of "kinda, sorta, maybe parses some forms of
> formulas" regexp.
>
And don't forget about named ranges, which can reference cells without
using anything but a plain identifier ...
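
To make the division of labour concrete, here is a rough sketch (mine, not
anything from Excel or pyparsing) in which re is used only to split the
formula into tokens, and a small hand-written recursive-descent parser deals
with the nesting. The names TOKEN_RE, tokenize, Parser and parse are made up
for illustration, and the grammar is an assumed subset: numbers, A1-style
cells and ranges, function calls, parentheses and a few operators. Sheet
names, strings, external workbooks and unary minus are left out. Note that a
bare identifier comes back as ("name", ...), since the text alone cannot tell
you whether it is a named range.

```python
import re

# Token patterns are illustrative assumptions, not Excel's full grammar.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<RANGE>[A-Za-z]{1,3}\d+:[A-Za-z]{1,3}\d+\b)
  | (?P<CELL>[A-Za-z]{1,3}\d+\b)
  | (?P<NAME>[A-Za-z_][A-Za-z0-9_.]*)
  | (?P<OP>>=|<=|<>|[-+*/=<>(),])
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(formula):
    """Split a formula into (kind, text) pairs, skipping the leading '='."""
    pos = 1 if formula.startswith("=") else 0
    tokens = []
    while pos < len(formula):
        m = TOKEN_RE.match(formula, pos)
        if m is None:
            raise ValueError(f"unexpected character {formula[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    """Recursive descent: the call stack tracks paren depth, not the regex."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self, value=None):
        kind, text = self.peek()
        if kind is None or (value is not None and text != value):
            raise ValueError(f"expected {value!r}, got {text!r}")
        self.i += 1
        return kind, text

    def binary(self, operand, ops):
        # Left-associative chain of operands joined by any operator in ops.
        node = operand()
        while self.peek()[0] == "OP" and self.peek()[1] in ops:
            op = self.take()[1]
            node = (op, node, operand())
        return node

    def comparison(self):
        return self.binary(self.additive, {">", "<", ">=", "<=", "=", "<>"})

    def additive(self):
        return self.binary(self.term, {"+", "-"})

    def term(self):
        return self.binary(self.atom, {"*", "/"})

    def atom(self):
        kind, text = self.take()
        if kind == "NUMBER":
            return float(text)
        if kind in ("CELL", "RANGE"):
            return (kind.lower(), text.upper())
        if kind == "NAME":
            if self.peek()[1] == "(":      # function call, possibly nested
                self.take("(")
                args = []
                if self.peek()[1] != ")":
                    args.append(self.comparison())
                    while self.peek()[1] == ",":
                        self.take(",")
                        args.append(self.comparison())
                self.take(")")
                return ("call", text.upper(), args)
            return ("name", text)          # could be a named range
        if text == "(":
            node = self.comparison()
            self.take(")")
            return node
        raise ValueError(f"unexpected token {text!r}")


def parse(formula):
    parser = Parser(tokenize(formula))
    tree = parser.comparison()
    if parser.i != len(parser.tokens):
        raise ValueError("trailing tokens after end of formula")
    return tree
```

With this, parse("=3*A7+5") returns ('+', ('*', 3.0, ('cell', 'A7')), 5.0),
and Tim's nested if(...) example parses to a ("call", "IF", [...]) tree with
a second IF call as its third argument. Unbalanced parentheses raise
ValueError. Resolving a ("name", ...) node still needs the workbook's list
of defined names, which no amount of pattern matching on the formula text
will give you.
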
regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010 http://us.pycon.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS: http://holdenweb.eventbrite.com/