Tag parsing in python
Paul McGuire
ptmcg at austin.rr.com
Sun Aug 29 08:43:49 EDT 2010
On Aug 28, 11:23 pm, Paul McGuire <pt... at austin.rr.com> wrote:
> On Aug 28, 11:14 am, agnibhu <dee... at gmail.com> wrote:
>
> > Hi all,
>
> > I'm a newbie in python. I'm trying to create a library for parsing
> > certain keywords.
> > For example say I've key words like abc: bcd: cde: like that... So the
> > user may use like
> > abc: How are you bcd: I'm fine cde: ok
>
> > So I've to extract the "How are you" and "I'm fine" and "ok"..and
> > assign them to abc:, bcd: and cde: respectively.. There may be
> > combination of keywords introduced in future. like abc: xy: How are
> > you
> > So new keywords qualifying the other keywords so on..
I got to thinking more about your keywords-qualifying-keywords
example, and I thought this would be a good way to support
locale-specific tags. I also thought about how one might want tags
within tags, to be substituted later. That requires an escaped form
of "abc:", written "abc::", so that the embedded tag is replaced with
the value of tag "abc:" as a late binding.
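As a rough illustration of the escape rule only (this is not the pyparsing script further down, just a standard-library sketch), a regular expression with a negative lookahead can tell "abc:" apart from "abc::". The names find_tags and unescape are hypothetical helpers made up for this example.

```python
import re

# A tag is a run of letters followed by exactly one ':'.
# The (?!:) lookahead rejects the escaped 'abc::' form.
TAG = re.compile(r"[A-Za-z]+:(?!:)")


def find_tags(text):
    """Return every unescaped tag found in text (hypothetical helper)."""
    return TAG.findall(text)


def unescape(body):
    """Turn the escaped 'abc::' form back into a plain 'abc:' tag."""
    return body.replace("::", ":")


print(find_tags("sorry: en: I'm sorry, nm::, I'm afraid"))  # ['sorry:', 'en:']
print(unescape("I'm sorry, nm::,"))                         # I'm sorry, nm:,
```

The escaped "nm::" is skipped while the line's own tags are collected, and only becomes a real "nm:" tag once the body is unescaped.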
Wasn't too hard to modify what I posted yesterday, and now I rather
like it.
-- Paul
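# tag_substitute.py

from pyparsing import (Combine, Word, alphas, FollowedBy, Group, OneOrMore,
    empty, SkipTo, LineEnd, Optional, Forward, MatchFirst, Literal,
    And, replaceWith)

# a tag is a word ending in a single ':' ("abc::" is the escaped form,
# so it does not count as a tag)
tag = Combine(Word(alphas) + ~FollowedBy("::") + ":")

# a definition is one or more tags, then body text running up to the
# next tag or the end of the line
tag_defn = (Group(OneOrMore(tag))("tag") + empty
            + SkipTo(tag | LineEnd())("body")
            + Optional(LineEnd().suppress()))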

# now combine macro detection with substitution
macros = {}
macro_substitution = Forward()

def make_macro_sub(tokens):
    # unescape '::' and substitute any embedded tags
    tag_value = macro_substitution.transformString(
        tokens.body.replace("::", ":"))

    # save this tag and value (or overwrite previous)
    macros[tuple(tokens.tag)] = tag_value

    # define overall macro substitution expression
    macro_substitution << MatchFirst(
        [(Literal(k[0]) if len(k) == 1
          else And([Literal(kk) for kk in k])).setParseAction(replaceWith(v))
         for k, v in macros.items()]) + ~FollowedBy(tag)

    # return empty string, so macro definitions don't show up in final
    # expanded text
    return ""

tag_defn.setParseAction(make_macro_sub)

# define pattern for macro scanning
scan_pattern = macro_substitution | tag_defn

sorry = """\
nm: Dave
sorry: en: I'm sorry, nm::, I'm afraid I can't do that.
sorry: es: Lo siento nm::, me temo que no puedo hacer eso.
Hal said, "sorry: en:"
Hal dijo, "sorry: es:" """

print(scan_pattern.transformString(sorry))
Prints:
Hal said, "I'm sorry, Dave, I'm afraid I can't do that."
Hal dijo, "Lo siento Dave, me temo que no puedo hacer eso."
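
For anyone who wants to try the idea without pyparsing installed, here is a minimal standard-library sketch of the same define-then-substitute flow. It is a simplification rather than an equivalent of the script above. It assumes a definition always starts at the beginning of a line, and the names expand, substitute and sample are invented for this example.

```python
import re

# 'abc:' is a tag; the (?!:) lookahead leaves the escaped 'abc::' alone.
TAG = r"[A-Za-z]+:(?!:)"
# Assumption: a definition is a line that starts with one or more tags.
DEFN = re.compile(r"\s*((?:%s\s*)+)(.*)" % TAG)


def expand(text, macros):
    """Replace every known tag sequence in text with its stored value."""
    if not macros:
        return text
    # Try longer tag sequences first so they win over shorter ones.
    keys = sorted(macros, key=len, reverse=True)
    alternatives = "|".join(
        r"\s*".join(re.escape(t) for t in key) for key in keys)
    # Skip a match if yet another tag follows it, or if it starts mid-word.
    pattern = re.compile(
        r"(?<![A-Za-z])(%s)(?!\s*%s)" % (alternatives, TAG))
    return pattern.sub(
        lambda m: macros[tuple(re.findall(TAG, m.group(1)))], text)


def substitute(text):
    """Collect tag definitions and expand tag references, line by line."""
    macros = {}  # maps a tuple of tags, e.g. ('sorry:', 'en:'), to its text
    output = []
    for line in text.splitlines():
        m = DEFN.match(line)
        if m:
            key = tuple(re.findall(TAG, m.group(1)))
            # Unescape '::' first, then expand tags defined so far.
            macros[key] = expand(m.group(2).replace("::", ":"), macros)
        else:
            output.append(expand(line, macros))
    return "\n".join(output)


sample = """\
nm: Dave
sorry: en: I'm sorry, nm::, I'm afraid I can't do that.
sorry: es: Lo siento nm::, me temo que no puedo hacer eso.
Hal said, "sorry: en:"
Hal dijo, "sorry: es:\""""

print(substitute(sample))
```

Keying the dictionary on a tuple of tags is what lets "sorry: en:" and "sorry: es:" live side by side. As in the pyparsing script, an embedded "nm::" is resolved when the enclosing definition is read, so "nm:" has to be defined before the line that uses it.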