parsing an Excel formula with the re module

MRAB python at mrabarnett.plus.com
Tue Jan 5 14:20:09 EST 2010


vsoler wrote:
> On 5 ene, 19:35, MRAB <pyt... at mrabarnett.plus.com> wrote:
>> vsoler wrote:
>>> Hello,
>>> I am accessing an Excel file by means of Win32 COM technology.
>>> For a given cell, I am able to read its formula. I want to make a map
>>> of how cells reference one another, how different sheets reference one
>>> another, how workbooks reference one another, etc.
>>> Hence, I need to parse Excel formulas. Can I do it by means only of re
>>> (regular expressions)?
>>> I know that for simple formulas such as "=3*A7+5" it is indeed
>>> possible. What about complex formulas that include functions,
>>> sheet names and possibly other *.xls files?
>>> For example    "=Book1!A5+8" should be parsed into ["=","Book1", "!",
>>> "A5","+","8"]
>>> Can anybody help? Any suggestions?
>> Do you mean "how" or do you really mean "whether", i.e., get a list of the
>> other cells that are referred to by a certain cell, for example,
>> "=3*A7+5" should give ["A7"] and "=Book1!A5+8" should give ["Book1!A5"]?
> 
> I'd like to know how to do it, should it be possible.
> 
Something like this should work:

     references = re.findall(r"\b((?:\w+!)?[A-Za-z]+\d+)\b", formula)
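
To try that out, here is a minimal, self-contained sketch that wraps the
same pattern in a function. The names CELL_REF and find_references are
only illustrative, and the sample formulas are the ones from this thread:

```python
import re

# Same pattern as above: an optional "Name!" prefix (sheet or book name made
# of word characters), then column letters followed by row digits.
CELL_REF = re.compile(r"\b((?:\w+!)?[A-Za-z]+\d+)\b")

def find_references(formula):
    """Return the cell references found in an Excel formula string."""
    return CELL_REF.findall(formula)

print(find_references("=3*A7+5"))      # ['A7']
print(find_references("=Book1!A5+8"))  # ['Book1!A5']
```

It is only a rough approximation of Excel's syntax. As written, it does
not match absolute references such as $A$1, it drops quoted sheet names
such as 'My Sheet'!A1 down to the bare cell, and it reports a function
name that ends in digits (LOG10, for example) as if it were a cell. For
anything beyond simple formulas a real tokenizer is probably the safer
route.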


