Using a function for regular expression substitution
python at
Sun Aug 29 13:14:15 EDT 2010
On 29/08/2010 15:22, naugiedoggie wrote:
> Hello,
> I'm having a problem with using a function as the replacement in
> re.sub().
> Here is the function:
> def normalize(s) :
>     return urllib.quote(string.capwords(urllib.unquote(s.group('provider'))))
This normalises the provider and returns only that. re.sub() replaces
the whole match with whatever the function returns, so the
'Search_Provider=' part of the match is lost.
I think you might want this:
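def normalize(s):
    # s is the match object; rebuild the matched text around the provider.
    text, offset = s.group(0), s.start()
    return (text[ : s.start('provider') - offset] +
        urllib.quote(string.capwords(urllib.unquote(s.group('provider')))) +
        text[s.end('provider') - offset : ])
It returns the part of the matched text before the provider, followed by
the normalised provider, and then the part of the match after the provider.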
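For anyone trying this out, here is a minimal, self-contained sketch of the same idea. It assumes Python 3, where quote() and unquote() live in urllib.parse rather than urllib as in the code above, and the log line is made up for illustration. It puts the parameter name back by using the 'search' group instead of slicing.

```python
import re
import string
from urllib.parse import quote, unquote  # Python 3 home of urllib.quote/unquote

provider_pattern = r'(?P<search>Search_Provider)=(?P<provider>[^&]+)'

def normalize(m):
    # m is the match object that re.sub() passes to the replacement function.
    # The returned string replaces the WHOLE match, so the parameter name
    # and '=' have to be put back in front of the normalised value.
    provider = string.capwords(unquote(m.group('provider')))
    return m.group('search') + '=' + quote(provider)

# Hypothetical log line, not taken from the original post.
line = 'GET /find?Search_Type=web&Search_Provider=acme%20search&page=2'
result = re.sub(provider_pattern, normalize, line)
print(result)
```

Everything outside the match is left alone by re.sub(), so the function only has to worry about the text the pattern actually matched.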
> The purpose of this function is to proper-case the words contained in
> a URL query string parameter value. I'm massaging data in web log
> files.
> In case it matters, the regex pattern looks like this:
> provider_pattern = r'(?P<search>Search_Provider)=(?P<provider>[^&]+)'
> The call looks like this:
> <code>
> re.sub(matcher,normalize,line)
> </code>
> Where line is the log line entry.
> What I get back is first the entire line with the normalization of the
> parameter value, but missing the parameter; then appended to that
> string is the entire line again, with the query parameter back in
> place pointing to the normalized string.
> <code>
>>>> fileReader = open(log,'r')
>>>> lines = fileReader.readlines()
>>>> for line in lines:
> if line.find('Search_Type') != -1 and line.find('Search_Provider') !=
> -1 :
These can be replaced by:
if 'Search_Type' in line and 'Search_Provider' in line:
> re.sub(provider_matcher,normalize,line)
re.sub is returning the result, which you're throwing away!
line = re.sub(provider_matcher,normalize,line)
> print line,'\n'
> </code>
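A small sketch of why the result has to be assigned: strings are immutable, so re.sub() cannot change line in place and always returns a new string. The sample text and the names before/after are made up for illustration.

```python
import re

line = 'Search_Provider=acme'

re.sub(r'acme', 'Acme', line)          # result discarded; line is unchanged
before = line

line = re.sub(r'acme', 'Acme', line)   # rebinding the name keeps the result
after = line

print(before)
print(after)
```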
> The output of the print is like this:
> <code>
> 'log-entry parameter=value&normalized-string&parameter=value\n
> log-entry parameter=value&parameter=normalized-string&parameter=value'
> </code>
> The goal is to massage the specified entries in the log files and
> write the entire log back into a new file. The new file has to be
> exactly the same as the old one, with the exception of the entries
> I've altered with my function.
> No doubt I'm doing something trivially wrong, but I've tried to
> reproduce the structure as defined in the documentation.
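Putting the pieces together, the goal of copying the log to a new file with only the matching entries changed might look roughly like this. It is a sketch under assumptions, not a drop-in solution: it is written for Python 3 (urllib.parse), and rewrite_log and its two path arguments are hypothetical names that do not appear in the original post.

```python
import re
import string
from urllib.parse import quote, unquote

provider_matcher = re.compile(
    r'(?P<search>Search_Provider)=(?P<provider>[^&]+)')

def normalize(m):
    # Return the parameter name plus the proper-cased, re-quoted value.
    provider = string.capwords(unquote(m.group('provider')))
    return m.group('search') + '=' + quote(provider)

def rewrite_log(src_path, dst_path):
    # newline='' passes line endings through untouched, so lines that are
    # not altered are written back exactly as they were read.
    with open(src_path, 'r', newline='') as src, \
         open(dst_path, 'w', newline='') as dst:
        for line in src:
            if 'Search_Type' in line and 'Search_Provider' in line:
                line = provider_matcher.sub(normalize, line)
            dst.write(line)
```

One thing to watch: if the provider is the last parameter on a line, [^&]+ runs on to the end of the line, including the newline, and string.capwords() collapses whitespace. In that case the pattern may need tightening, for example to [^&\s]+.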