RFC: Assignment as expression (pre-PEP)

darklord@timehorse.com TimeHorse at gmail.com
Thu Apr 5 23:08:46 CEST 2007


On Apr 5, 4:22 pm, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
> Can you come up with a real example where this happens and which cannot be
> easily rewritten to provide better, clearer code without the indentation?
>
> I'll admit to having occasionally had code not entirely dissimilar to this
> when first written, but I don't believe it has ever survived more than a
> few minutes before being refactored into a cleaner form. I would claim that
> it is a good thing that Python makes it obvious that code like this should
> be refactored.

I am trying to write a parser for a text string.  Specifically, I am
trying to take a filename that contains meta-data about the content of
the A/V file (mpg, mp3, etc.) and extract that meta-data from it.

I first split the filename into fields separated by spaces and dots.
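
As a rough sketch of that splitting step: the pattern bound to `sd` is
not shown in this post, so the one below is only an assumption, and the
helper name and sample filename are made up for illustration.

```python
import re

# Assumed definition of `sd`; the real pattern is not shown in the post.
# It splits on runs of spaces and dots.
sd = re.compile(r'[ .]+')

def split_fields(name):
    """Split a filename into (fields, extension)."""
    parts = sd.split(name)
    # The last piece is treated as the file extension.
    return parts[:-1], parts[-1]

# Hypothetical filename, for illustration only.
fields, ext = split_fields('Some Show.S01E02.The Pilot.2007.mpg')
```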

Then I have a series of regular expression matches.  I like
Cartesian's 'event-based' parser approach, though the event table gets
a bit unwieldy as it grows.  Also, I would prefer to have the 'action'
result in a variable assignment specific to the test.  E.g.

def parseName(name):
    fields = sd.split(name)
    fields, ext = fields[:-1], fields[-1]
    year = ''
    capper = ''
    series = None
    episodeNum = None
    programme = ''
    episodeName = ''
    past_title = False
    for f in fields:
        if year_re.match(f):
            year = f
            past_title = True
        else:
            my_match = capper_re.match(f)
            if my_match:
                capper = my_match.group(1)
                if capper == 'JJ' or capper == 'JeffreyJacobs':
                    capper = 'Jeffrey C. Jacobs'
                past_title = True
            else:
                my_match = epnum_re.match(f)
                if my_match:
                    series, episodeNum = my_match.group('series', 'episode')
                    past_title = True
                else:
                    # If I think of other parse elements, they go here.
                    # Otherwise, name is part of a title; check for
                    # capitalization
                    if 'a' <= f[0] <= 'z' and f not in do_not_capitalize:
                        f = f.capitalize()
                    if past_title:
                        if episodeName: episodeName += ' '
                        episodeName += f
                    else:
                        if programme: programme += ' '
                        programme += f

    return programme, series, episodeName, episodeNum, year, capper, ext

Now, the problem with this code is that it assumes only 2 pieces of
free-form meta-data in the name (i.e. Programme Name and Episode
Name).  Also, although this is not directly adaptable to Cartesian's
approach, you COULD rewrite it using a dictionary in place of local
variable names, so that each event consists of 3 properties:
compiled_re, action_method, dictionary_string.  But even with that,
the epnum match requires two assignments.  That would need some scheme
where, if dictionary_string is a list, the ith value returned by
action_method is bound to the ith dictionary element named in
dictionary_string, which seems a bit convoluted.  And the fall-through
case is state-dependent, since the 'unrecognized field' should be
shuffled into a different variable depending on state.  Still, if
there is a better approach I am certainly up for it.  I love
event-based parsers so I have no problem with that approach in
general.
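
To make the dictionary-based variant concrete, here is a minimal
sketch, not a finished design.  All three regular expressions, the
`aliases` table, the contents of `do_not_capitalize`, and the names
`events` and `parse_fields` are assumptions made up for illustration;
the real patterns are not shown above.  Each action returns a tuple,
so the epnum event can bind two dictionary keys at once.

```python
import re

# Hypothetical stand-ins for the real patterns, which are not shown.
year_re = re.compile(r'^(?:19|20)\d\d$')
capper_re = re.compile(r'^\[(\w+)\]$')
epnum_re = re.compile(r'^S(?P<series>\d+)E(?P<episode>\d+)$', re.I)
do_not_capitalize = {'a', 'an', 'and', 'of', 'the'}
aliases = {'JJ': 'Jeffrey C. Jacobs', 'JeffreyJacobs': 'Jeffrey C. Jacobs'}

def capper_action(m):
    # Expand known abbreviations of the capper's name.
    name = m.group(1)
    return (aliases.get(name, name),)

# Each event: (compiled_re, action_method, dictionary keys to bind).
# The action returns one value per key, in the same order.
events = [
    (year_re, lambda m: (m.group(0),), ('year',)),
    (capper_re, capper_action, ('capper',)),
    (epnum_re, lambda m: m.group('series', 'episode'),
     ('series', 'episodeNum')),
]

def parse_fields(fields):
    result = {'year': '', 'capper': '', 'series': None,
              'episodeNum': None, 'programme': '', 'episodeName': ''}
    past_title = False
    for f in fields:
        for pattern, action, keys in events:
            m = pattern.match(f)
            if m:
                # Bind the ith returned value to the ith key.
                result.update(zip(keys, action(m)))
                past_title = True
                break
        else:
            # Fall-through: a free-form title word.  Which key it
            # lands in depends on the parser state.
            if 'a' <= f[:1] <= 'z' and f not in do_not_capitalize:
                f = f.capitalize()
            key = 'episodeName' if past_title else 'programme'
            result[key] = (result[key] + ' ' + f).strip()
    return result
```

This keeps the nesting flat, but the state-dependent fall-through
still has to live outside the event table.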



