RFC: Assignment as expression (pre-PEP)
darklord@timehorse.com
TimeHorse at gmail.com
Thu Apr 5 17:08:46 EDT 2007
On Apr 5, 4:22 pm, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
> Can you come up with a real example where this happens and which cannot be
> easily rewritten to provide better, clearer code without the indentation?
>
> I'll admit to having occasionally had code not entirely dissimilar to this
> when first written, but I don't believe it has ever survived more than a
> few minutes before being refactored into a cleaner form. I would claim that
> it is a good thing that Python makes it obvious that code like this should
> be refactored.
I am trying to write a parser for a text string. Specifically, I am
trying to take a filename that contains meta-data about the content of
an A/V file (mpg, mp3, etc.) and extract that meta-data.
I first split the filename into fields separated by spaces and dots.
Then I run a series of regular expression matches against each field. I like
Cartesian's 'event-based' parser approach, though the event table gets a
bit unwieldy as it grows. Also, I would prefer to have the 'action'
result in a variable assignment specific to the test. E.g.:
def parseName(name):
    fields = sd.split(name)
    fields, ext = fields[:-1], fields[-1]
    year = ''
    capper = ''
    series = None
    episodeNum = None
    programme = ''
    episodeName = ''
    past_title = False
    for f in fields:
        if year_re.match(f):
            year = f
            past_title = True
        else:
            my_match = capper_re.match(f)
            if my_match:
                capper = my_match.group(1)
                if capper == 'JJ' or capper == 'JeffreyJacobs':
                    capper = 'Jeffrey C. Jacobs'
                past_title = True
            else:
                my_match = epnum_re.match(f)
                if my_match:
                    series, episodeNum = my_match.group('series', 'episode')
                    past_title = True
                else:
                    # If I think of other parse elements, they go here.
                    # Otherwise, name is part of a title; check for capitalization
                    if f[0] >= 'a' and f[0] <= 'z' and f not in do_not_capitalize:
                        f = f.capitalize()
                    if past_title:
                        if episodeName: episodeName += ' '
                        episodeName += f
                    else:
                        if programme: programme += ' '
                        programme += f
    return programme, series, episodeName, episodeNum, year, capper, ext
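parseName depends on several module-level names (sd, year_re, capper_re, epnum_re, do_not_capitalize) that are not shown here. As a rough sketch only, they might look something like the following. Every pattern, the word list, and the sample filename are assumptions made for illustration, not the real definitions.

```python
import re

# All of these definitions are guesses; the post does not show the real ones.
sd = re.compile(r'[ .]+')               # split on runs of spaces and dots
year_re = re.compile(r'^(19|20)\d\d$')  # a four-digit year such as 2007
capper_re = re.compile(r'^\[(\w+)\]$')  # e.g. "[JJ]"; group(1) is the tag
epnum_re = re.compile(r'^S(?P<series>\d+)E(?P<episode>\d+)$', re.IGNORECASE)
do_not_capitalize = {'a', 'an', 'and', 'of', 'the'}

# Hypothetical filename, split the same way parseName splits it.
fields = sd.split('Some Show.S02E05.the pilot.2007.[JJ].mpg')
```

With definitions along these lines, the last element of fields is the extension and the rest are the candidates tested against each pattern in turn.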
Now, the problem with this code is that it assumes only 2 pieces of
free-form meta-data in the name (i.e. Programme Name and Episode
Name). Also, although this is not directly adaptable to Cartesian's
approach, you COULD rewrite it using a dictionary in place of
local variable names, so that each event in the lookup table consists of 3
properties: compiled_re, action_method, dictionary_string.
But even with that, the epnum match requires two assignments.
That would need a scheme where, if dictionary_string is a list, the ith
value returned by action_method is bound to the ith dictionary
element named in dictionary_string, which seems a bit convoluted. And
the fall-through case is state-dependent, since the 'unrecognized
field' should be shuffled into a different variable depending on
state. Still, if there is a better approach I am certainly up for
it. I love event-based parsers, so I have no problem with that
approach in general.
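A minimal sketch of the three-property event table described above, to make the trade-off concrete. The regular expressions, the capper_aliases mapping, and the names events, capper_action and parse_fields are all hypothetical, and the capitalization step is left out for brevity.

```python
import re

# Hypothetical patterns; the post does not show the real ones.
year_re = re.compile(r'^(19|20)\d\d$')
capper_re = re.compile(r'^\[(\w+)\]$')
epnum_re = re.compile(r'^S(?P<series>\d+)E(?P<episode>\d+)$', re.IGNORECASE)
capper_aliases = {'JJ': 'Jeffrey C. Jacobs',
                  'JeffreyJacobs': 'Jeffrey C. Jacobs'}

def capper_action(m):
    # Expand known abbreviations, otherwise keep the tag as written.
    tag = m.group(1)
    return capper_aliases.get(tag, tag)

# Each event is (compiled_re, action_method, dictionary_string).
# A tuple of keys means the action returns one value per key.
events = [
    (year_re, lambda m: m.group(0), 'year'),
    (capper_re, capper_action, 'capper'),
    (epnum_re, lambda m: m.group('series', 'episode'),
     ('series', 'episodeNum')),
]

def parse_fields(fields):
    info = {'programme': '', 'episodeName': '', 'year': '',
            'capper': '', 'series': None, 'episodeNum': None}
    past_title = False
    for f in fields:
        for pattern, action, keys in events:
            m = pattern.match(f)
            if m:
                values = action(m)
                if isinstance(keys, str):
                    info[keys] = values
                else:
                    # Bind the ith returned value to the ith named key.
                    info.update(zip(keys, values))
                past_title = True
                break
        else:
            # Fall-through: the target key depends on the parser state.
            key = 'episodeName' if past_title else 'programme'
            info[key] = (info[key] + ' ' + f).strip()
    return info

result = parse_fields(['Some', 'Show', 'S02E05', 'The', 'Pilot', '2007', '[JJ]'])
```

The for/else keeps the nesting flat, but the multi-key case and the state-dependent fall-through still need special handling outside the table, which is the awkwardness mentioned above.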