regexps to objects
Peter Otten
__peter__ at web.de
Fri Jul 27 06:24:41 EDT 2012
andrea crotti wrote:
> I have some complex input to parse (with regexps), and I would like to
> create nice objects directly from them.
> The re module of course doesn't try to convert to any type, so I was
> playing around to see if it's worth doing something like the code below,
> where I assign a constructor to every regexp and build an object from the
> result.
>
> Do you think it makes sense in general or how do you cope with this
> problem?
>
> import re
> from time import strptime
> TIME_FORMAT_INPUT = '%m/%d/%Y %H:%M:%S'
>
> def time_string_to_obj(timestring):
>     return strptime(timestring, TIME_FORMAT_INPUT)
>
>
> REGEXPS = {
>     'num': ('\d+', int),
>     'date': ('[0-9/]+ [0-9:]+', time_string_to_obj),
> }
>
>
> def reg_to_obj(reg, st):
>     reg, constr = reg
>     found = re.match(reg, st)
>     return constr(found.group())
>
>
> if __name__ == '__main__':
>     print reg_to_obj(REGEXPS['num'], '100')
>     print reg_to_obj(REGEXPS['date'], '07/24/2012 06:23:13')
There is an undocumented Scanner class in the re module:
>>> from datetime import datetime
>>> from re import Scanner
>>> sc = Scanner([
...     ("[0-9/]+ [0-9:]+", lambda self, s: datetime.strptime(s, "%m/%d/%Y %H:%M:%S")),
...     (r"\d+", lambda self, s: int(s)),
...     ("\s+", lambda self, s: None)])
>>> sc.scan("07/24/2012 06:23:13")
([datetime.datetime(2012, 7, 24, 6, 23, 13)], '')
>>> sc.scan("07/24/2012 06:23:13 123")
([datetime.datetime(2012, 7, 24, 6, 23, 13), 123], '')
However, the scanner breaks as soon as a plain number precedes the date:
>>> sc.scan("456 07/24/2012 06:23:13 123")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 322, in scan
    action = action(self, m.group())
  File "<stdin>", line 2, in <lambda>
  File "/usr/lib/python2.7/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data '456 07' does not match format '%m/%d/%Y %H:%M:%S'
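
The date pattern is listed first, and "[0-9/]+ [0-9:]+" happily matches "456 07", so strptime() gets handed something that isn't a date. One way around that, as a minimal sketch assuming the dates always look like MM/DD/YYYY HH:MM:SS, is to make the date pattern as strict as the strptime() format. The names DATE_PATTERN and TIME_FORMAT are only for illustration, and the script is written so that it also runs on Python 3:

```python
import re
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

# Assumption: dates always have two-digit month/day/time fields and a
# four-digit year, so a bare number like "456" can no longer start a match.
DATE_PATTERN = r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}"

scanner = re.Scanner([
    # Patterns are tried in order, so the date rule stays ahead of the
    # generic number rule.
    (DATE_PATTERN, lambda self, s: datetime.strptime(s, TIME_FORMAT)),
    (r"\d+", lambda self, s: int(s)),
    # An action of None makes the scanner drop the match (whitespace here).
    (r"\s+", None),
])

tokens, rest = scanner.scan("456 07/24/2012 06:23:13 123")
print(tokens)
print(repr(rest))
```

With the stricter pattern, tokens is [456, datetime.datetime(2012, 7, 24, 6, 23, 13), 123] and rest is the empty string. Anything the scanner cannot match still ends up in rest, so it is worth checking that value before trusting the tokens.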