substitution
Iain King
iainking at gmail.com
Mon Jan 18 08:22:37 EST 2010
On Jan 18, 12:41 pm, Iain King <iaink... at gmail.com> wrote:
> On Jan 18, 10:21 am, superpollo <ute... at esempio.net> wrote:
>
>
>
> > superpollo wrote:
>
> > > hi.
>
> > > what is the most pythonic way to substitute substrings?
>
> > > eg: i want to apply:
>
> > > foo --> bar
> > > baz --> quux
> > > quuux --> foo
>
> > > so that:
>
> > > fooxxxbazyyyquuux --> barxxxquuxyyyfoo
>
> > > bye
>
> > i explain better:
>
> > say the subs are:
>
> > quuux --> foo
> > foo --> bar
> > baz --> quux
>
> > then i cannot apply the subs in sequence (say, .replace() in a loop),
> > otherwise:
>
> > fooxxxbazyyyquuux --> fooxxxbazyyyfoo --> barxxxbazyyybar -->
> > barxxxquuxyyybar
>
> > not as intended...
>
> Not sure if it's the most pythonic, but I'd probably do it like this:
>
> def token_replace(string, subs):
>     subs = dict(subs)
>     tokens = {}
>     for i, sub in enumerate(subs):
>         tokens[sub] = i
>         tokens[i] = sub
>     current = [string]
>     for sub in subs:
>         new = []
>         for piece in current:
>             if type(piece) == str:
>                 chunks = piece.split(sub)
>                 new.append(chunks[0])
>                 for chunk in chunks[1:]:
>                     new.append(tokens[sub])
>                     new.append(chunk)
>             else:
>                 new.append(piece)
>         current = new
>     output = []
>     for piece in current:
>         if type(piece) == str:
>             output.append(piece)
>         else:
>             output.append(subs[tokens[piece]])
>     return ''.join(output)
>
> >>> token_replace("fooxxxbazyyyquuux", [("quuux", "foo"), ("foo", "bar"), ("baz", "quux")])
>
> 'barxxxquuxyyyfoo'
>
> I'm sure someone could whittle that down to a handful of list comps...
> Iain
Slightly better: this version lets you have overlapping search strings,
which are applied in the order they are fed in (so pass a list of pairs
rather than a dict if the order matters):

def token_replace(string, subs):
    # tokens maps each search string to an integer placeholder, and
    # each integer placeholder to the replacement text.
    tokens = {}
    if type(subs) == dict:
        for i, sub in enumerate(subs):
            tokens[sub] = i
            tokens[i] = subs[sub]
    else:
        # A sequence of (search, replacement) pairs keeps its order.
        s = []
        for i, (k,v) in enumerate(subs):
            tokens[k] = i
            tokens[i] = v
            s.append(k)
        subs = s
    # current is a list of untouched strings and integer placeholders.
    current = [string]
    for sub in subs:
        new = []
        for piece in current:
            if type(piece) == str:
                chunks = piece.split(sub)
                new.append(chunks[0])
                for chunk in chunks[1:]:
                    new.append(tokens[sub])
                    new.append(chunk)
            else:
                new.append(piece)
        current = new
    # Swap the placeholders for their replacement text.
    output = []
    for piece in current:
        if type(piece) == str:
            output.append(piece)
        else:
            output.append(tokens[piece])
    return ''.join(output)
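
For comparison, here is a rough sketch of a different technique, not the
token approach above: build one regular expression out of all the search
strings and let re.sub do a single pass. The name regex_replace is made
up for this example. Replacement text is never rescanned, so the
foo/bar/quuux case comes out as intended. It is not a drop-in equivalent,
though: token_replace gives an earlier search string priority over the
whole string, while the regex takes the leftmost match first and only
uses the feed order to break ties at the same position.

```python
import re

def regex_replace(string, subs):
    """Single-pass substitution; subs is a dict or a list of pairs."""
    pairs = list(subs.items()) if isinstance(subs, dict) else list(subs)
    if not pairs:
        # An empty pattern would match everywhere, so bail out early.
        return string
    mapping = dict(pairs)
    # Alternatives are tried in the order given, so when two search
    # strings match at the same position the earlier one wins.
    pattern = re.compile("|".join(re.escape(k) for k, _ in pairs))
    # Text that has been substituted in is not scanned again.
    return pattern.sub(lambda m: mapping[m.group(0)], string)

print(regex_replace("fooxxxbazyyyquuux",
                    [("quuux", "foo"), ("foo", "bar"), ("baz", "quux")]))
# barxxxquuxyyyfoo
```

To see the difference in priority: with [("ab", "1"), ("ca", "2")] on
"cab", the regex version matches "ca" first because it starts further
left, whereas token_replace splits on "ab" first.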