One thing I noticed is that the replacement inserts an extra space between " and />. For example:<br><br><br><root><frame type="image" /></root><br><br>Notice that there's a space between 

<span style="font-weight: bold;">"image"</span> and <span style="font-weight: bold;">/></root></span><br><br>Any way to fix this? Thanks.<br><br><div><span class="gmail_quote">On 9/24/07, <b class="gmail_sendername">

Gabriel Genellina</b> <<a href="mailto:gagsl-py2@yahoo.com.ar">gagsl-py2@yahoo.com.ar</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

On Mon, 24 Sep 2007 23:51:57 -0300, Robert Dailey <<a href="mailto:rcdailey@gmail.com">rcdailey@gmail.com</a>><br>wrote:<br><br>> What I meant was that it's not an option because I'm trying to learn<br>

> regular<br>> expressions. RE is just as built in as anything else.<br><br>Ok, let's analyze what you want. You have for instance this text:<br>"<action></action>"<br>which should become<br>

"<action/>"<br><br>You have to match:<br>(opening angle bracket)(any word)(closing angle bracket)(opening angle<br>bracket)(slash)(same word as before)(closing angle bracket)<br><br>This translates rather directly into this regular expression:

<br><br>r"<(\w+)></\1>"<br><br>where \w+ means "one or more alphanumeric characters or _", and<br>surrounding it with () creates a group (group number one), which is<br>back-referenced as \1 to express "same word as before".

<br>The matched text should be replaced by (opening <)(the word<br>found)(slash)(closing >), that is: r"<\1/>"<br>Using the sub function in module re:<br><br>py> import re<br>py> source = """

<br>... <root></root><br>... <root/><br>... <root><frame type="image"><action></action></frame></root><br>... <root><frame type="image"><action/></frame></root>

<br>... """<br>py> print re.sub(r"<(\w+)></\1>", r"<\1/>", source)<br><br><root/><br><root/><br><root><frame type="image"><action/></frame></root>

<br><root><frame type="image"><action/></frame></root><br><br>Now, a more complex example, involving tags with attributes:<br><frame type="image"></frame>  -->  <frame type="image" />

<br><br>You have to match:<br>(opening angle bracket)(any word)(any sequence of words, spaces, other<br>symbols, but NOT a closing angle bracket)(closing angle bracket)(opening<br>angle bracket)(slash)(same word as before)(closing angle bracket)

<br><br>r"<(\w+)([^>]*)></\1>"<br><br>[^>] means "anything but a >", the * means "may occur many times, maybe<br>zero", and it's enclosed in () to create group 2.<br>

<br>py> source = """<br>... <root></root><br>... <root><frame type="image"></frame></root><br>... """<br>py> print re.sub(r"<(\w+)([^>]*)></\1>", r"<\1\2 />", source)

<br><br><root /><br><root><frame type="image" /></root><br><br>Next step would be to allow whitespace wherever it is legal to appear -<br>left as an exercise to the reader. Hint: use \s*<br>
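One possible take on that exercise, as a rough sketch rather than the intended solution: put \s* before each closing angle bracket and between the two tags, and make group 2 non-greedy so trailing spaces are not captured with the attributes. The names EMPTY_ELEMENT and collapse_empty are made up for this illustration, and the code is written for Python 3.

```python
import re

# Hypothetical helper, not part of the original post.
# <name attrs [ws]> [ws] </name [ws]>  -->  <name attrs />
# ([^>]*?) is non-greedy so that whitespace just before ">" is consumed
# by the following \s* instead of ending up in group 2.
EMPTY_ELEMENT = re.compile(r"<(\w+)([^>]*?)\s*>\s*</\1\s*>")

def collapse_empty(source):
    """Rewrite empty open/close tag pairs as self-closing tags."""
    return EMPTY_ELEMENT.sub(r"<\1\2 />", source)

print(collapse_empty('<root><frame type="image" >\n</frame></root>'))
```

Like the patterns above, this assumes well-formed input and does not handle a ">" inside an attribute value.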

<br>--<br>Gabriel Genellina</blockquote></div><br>
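On the space question at the top of this message: the space is not arbitrary. It comes from the literal space in the replacement string r"<\1\2 />" used in the quoted example. A minimal sketch of the same substitution without it, assuming Python 3 (the function name collapse_empty_no_space is made up for illustration):

```python
import re

def collapse_empty_no_space(source):
    """Same pattern as in the quoted message, but the replacement
    string has no space before the slash."""
    return re.sub(r"<(\w+)([^>]*)></\1>", r"<\1\2/>", source)

print(collapse_empty_no_space('<root><frame type="image"></frame></root>'))
```

Both forms, with and without the space before />, are well-formed XML, so this is purely a matter of taste.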