Regular expression for "key = value" pairs

Mark Wooding mdw at distorted.org.uk
Wed Dec 22 11:53:43 EST 2010


Ciccio <franapoli at gmail.com> writes:

> suppose I have:
>
> s='a=b, c=d'
>
> and I want to extract sub-strings a,b,c and d from s (and in general
> from any longer list of such comma separated pairs).
[...]
> In [12]: re.findall(r'(.+)=(.+)', s)
> Out[12]: [('a=b, c', 'd')]

I think there are two logically separate jobs here: firstly, extracting
the comma-separated pairs, and secondly parsing the individual pairs.

If you want the extra problem of dealing with regular expressions, this
seems to be the way to parse an individual pair.

        import re

        # Verbose pattern: key (no `=', trimmed), `=', value (trimmed).
        R_PAIR = re.compile(r'''
                ^\s*
                ([^=\s]|[^=\s][^=]*[^=\s])
                \s*=\s*
                (\S|\S.*\S)
                \s*$
        ''', re.X)

        def parse_pair(pair):
          m = R_PAIR.match(pair)
          if not m:
            raise ValueError("not a `KEY = VALUE' pair")
          return m.group(1, 2)

The former job, splitting out the comma-separated pairs, is even easier.

        R_COMMA = re.compile(r'\s*,\s*')

        kvs = [parse_pair(p) for p in R_COMMA.split(s)]

Apply gold-plating to taste.
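
For anyone who wants to try it, here is a self-contained sketch of the
regular-expression approach run against the string from the question.
The wrapper name parse_pairs is my own invention for illustration, and
the error message is only a suggestion.

```python
import re

# One `KEY = VALUE' pair: the key may not contain `=', and both the
# key and the value are captured without surrounding whitespace.
R_PAIR = re.compile(r'''
    ^\s*
    ([^=\s]|[^=\s][^=]*[^=\s])
    \s*=\s*
    (\S|\S.*\S)
    \s*$
''', re.X)

# Separator between pairs: a comma with optional whitespace around it.
R_COMMA = re.compile(r'\s*,\s*')

def parse_pair(pair):
    m = R_PAIR.match(pair)
    if not m:
        raise ValueError('not a KEY = VALUE pair: %r' % (pair,))
    return m.group(1, 2)

def parse_pairs(text):
    # Hypothetical helper: split on commas, then parse each piece.
    return [parse_pair(p) for p in R_COMMA.split(text)]

kvs = parse_pairs('a=b, c=d')
print(kvs)  # [('a', 'b'), ('c', 'd')]
```

Note that a piece with no `=' in it raises ValueError rather than being
silently dropped, which is probably what you want while debugging input.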

But actually, it's much easier to avoid messing with regular expressions
at all.

        def parse_pair(pair):
          eq = pair.index('=')
          return pair[:eq].strip(), pair[eq + 1:].strip()

        kvs = [parse_pair(p) for p in s.split(',')]
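
As a quick check, here is the same idea as a runnable sketch on the
string from the question. It assumes that no key or value contains a
comma; turning the result into a dict is my own addition, in case that
is the shape you are after.

```python
def parse_pair(pair):
    # str.index raises ValueError if there is no '=' in the piece.
    eq = pair.index('=')
    return pair[:eq].strip(), pair[eq + 1:].strip()

s = 'a=b, c=d'
kvs = [parse_pair(p) for p in s.split(',')]
print(kvs)        # [('a', 'b'), ('c', 'd')]
print(dict(kvs))  # {'a': 'b', 'c': 'd'}
```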

-- [mdw]


