RFC: Assignment as expression (pre-PEP)
Steven Bethard
steven.bethard at gmail.com
Thu Apr 5 19:09:43 EDT 2007
Steven Bethard wrote:
> darklord at timehorse.com wrote:
>> On Apr 5, 4:22 pm, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
>>> Can you come up with a real example where this happens and which cannot
>>> be easily rewritten to provide better, clearer code without the
>>> indentation?
>>>
>>> I'll admit to having occasionally had code not entirely dissimilar to
>>> this when first written, but I don't believe it has ever survived more
>>> than a few minutes before being refactored into a cleaner form. I would
>>> claim that it is a good thing that Python makes it obvious that code like
>>> this should be refactored.
>>
>> I am trying to write a parser for a text string. Specifically, I am
>> trying to take a filename that contains meta-data about the content of
>> the A/V file (mpg, mp3, etc.).
>>
>> I first split the filename into fields separated by spaces and dots.
>>
>> Then I have a series of regular expression matches. I like
>> Cartesian's 'event-based' parser approach though the event table gets a
>> bit unwieldy as it grows. Also, I would prefer to have the 'action'
>> result in a variable assignment specific to the test. E.g.
>>
>> def parseName(name):
>>     fields = sd.split(name)
>>     fields, ext = fields[:-1], fields[-1]
>>     year = ''
>>     capper = ''
>>     series = None
>>     episodeNum = None
>>     programme = ''
>>     episodeName = ''
>>     past_title = False
>>     for f in fields:
>>         if year_re.match(f):
>>             year = f
>>             past_title = True
>>         else:
>>             my_match = capper_re.match(f)
>>             if my_match:
>>                 capper = capper_re.match(f).group(1)
>>                 if capper == 'JJ' or capper == 'JeffreyJacobs':
>>                     capper = 'Jeffrey C. Jacobs'
>>                 past_title = True
>>             else:
>>                 my_match = epnum_re.match(f)
>>                 if my_match:
>>                     series, episodeNum = my_match.group('series',
>>                                                         'episode')
>>                     past_title = True
>>                 else:
>>                     # If I think of other parse elements, they go here.
>>                     # Otherwise, name is part of a title; check for
>>                     # capitalization
>>                     if (f[0] >= 'a' and f[0] <= 'z' and
>>                             f not in do_not_capitalize):
>>                         f = f.capitalize()
>>                     if past_title:
>>                         if episodeName: episodeName += ' '
>>                         episodeName += f
>>                     else:
>>                         if programme: programme += ' '
>>                         programme += f
>>
>>     return programme, series, episodeName, episodeNum, year, capper, ext
>
> Why can't you combine your regular expressions into a single expression,
> e.g. something like::
>
> >>> import re
> >>> exp = r'''
> ... (?P<year>\d{4})
> ... |
> ... by\[(?P<capper>.*)\]
> ... |
> ... S(?P<series>\d\d)E(?P<episode>\d\d)
> ... '''
> >>> matcher = re.compile(exp, re.VERBOSE)
> >>> matcher.match('1990').groupdict()
> {'series': None, 'capper': None, 'episode': None, 'year': '1990'}
> >>> matcher.match('by[Jovev]').groupdict()
> {'series': None, 'capper': 'Jovev', 'episode': None, 'year': None}
> >>> matcher.match('S01E12').groupdict()
> {'series': '01', 'capper': None, 'episode': '12', 'year': None}
>
> Then your code above would look something like::
>
>     for f in fields:
>         match = matcher.match(f)
>         if match is not None:
>             year = match.group('year')
>             capper = match.group('capper')
>             if capper == 'JJ' or capper == 'JeffreyJacobs':
>                 capper = 'Jeffrey C. Jacobs'
>             series = match.group('series')
>             episodeNum = match.group('episode')
>             past_title = True

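I guess you need to be a little more careful here not to overwrite your
old values: only one alternative of the combined expression matches any
given field, so the other groups come back as None. E.g. something like::

    year = match.group('year') or year
    capper = match.group('capper') or capper
    ...
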
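
Putting the pieces together, a minimal self-contained sketch might look
something like the following. The names FIELD_RE, parse_fields, info and
title_words are made up for illustration, the sample field list is
invented, and the capitalization and programme/episode-title handling
from the original function is left out:

```python
import re

# The combined expression from earlier in this message.
FIELD_RE = re.compile(r'''
    (?P<year>\d{4})
    |
    by\[(?P<capper>.*)\]
    |
    S(?P<series>\d\d)E(?P<episode>\d\d)
''', re.VERBOSE)


def parse_fields(fields):
    """Collect metadata from fields; fields that match nothing are title words."""
    info = {'year': '', 'capper': '', 'series': None, 'episode': None}
    title_words = []
    for f in fields:
        match = FIELD_RE.match(f)
        if match is None:
            title_words.append(f)
            continue
        for name, value in match.groupdict().items():
            # Groups from the alternatives that did not match are None,
            # so "or" keeps whatever an earlier field already supplied.
            info[name] = value or info[name]
    return info, title_words


# Hypothetical sample input, already split into fields.
info, title_words = parse_fields(['Some', 'Show', 'S01E12', '1990', 'by[Jovev]'])
print(info)
print(title_words)
```

Looping over groupdict() avoids writing one "x = match.group('x') or x"
line per group, at the cost of keeping the results in a dict rather than
in separate local variables.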
STeVe