emacs lisp text processing example (html5 figure/figcaption)
xahlee at gmail.com
Tue Jul 5 22:37:18 CEST 2011
On Jul 5, 12:17 pm, Ian Kelly <ian.g.ke... at gmail.com> wrote:
> On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee <xah... at gmail.com> wrote:
> > So, a solution by regex is out.
> Actually, none of the complications you listed appear to exclude
> regexes. Here's a possible (untested) solution:
> <div class="img">
> ((?:\s*<img src="[^.]+\.(?:jpg|png|gif)" alt="[^"]+" width="[0-9]+"
> \s*<p class="cpt">((?:[^<]|<(?!/p>))+)</p>
> and corresponding replacement string:
> I don't know what dialect Emacs uses for regexes; the above is the
> Python re dialect. I assume it is translatable. If not, then the
> above should at least work with other editors, such as Komodo's
> "Find/Replace in Files" command. I kept the line breaks here for
> readability, but for completeness they should be stripped out of the
> final regex.
> The possibility of nested HTML in the caption is allowed for by using
> a negative look-ahead assertion to accept any tag except a closing
> </p>. It would break if you had nested <p> tags, but then that would
> be invalid html anyway.
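
As a minimal sketch of the look-ahead idea described above, the caption fragment of the quoted regex can be tried on its own in Python. The sample HTML string and the helper name extract_caption are assumptions made up for illustration; they are not from the original post, and only the caption part of the (untested) regex is exercised here.

```python
import re

# Caption fragment from the quoted regex: accept any character that is not "<",
# or a "<" that does not start the closing "</p>" (negative look-ahead).
CAPTION_RE = re.compile(r'<p class="cpt">((?:[^<]|<(?!/p>))+)</p>')


def extract_caption(html):
    """Return the text inside the first <p class="cpt">...</p>, or None."""
    match = CAPTION_RE.search(html)
    return match.group(1) if match else None


# Hypothetical sample input with nested HTML inside the caption.
sample = '<p class="cpt">A <b>bold</b> caption</p><p>next paragraph</p>'
print(extract_caption(sample))  # A <b>bold</b> caption
```

The nested <b> tag is kept because its "<" is not followed by "/p>", and the match stops at the first closing </p>.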
Emacs regex supports shy groups (the 「(?:…)」), but it doesn't support the
negative lookahead assertion 「(?!…)」.
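
One possible way around the missing look-ahead, sketched here in Python so it can be compared directly with the quoted pattern, is a non-greedy match that stops at the first closing </p>. This is an assumption about how the fragment could be rewritten, not something proposed in the thread. It only behaves the same for the caption fragment taken on its own; inside a longer regex a non-greedy group can backtrack past the first </p>. The helper name caption and the sample strings are hypothetical.

```python
import re

# Quoted caption fragment, using the negative look-ahead.
LOOKAHEAD_RE = re.compile(r'<p class="cpt">((?:[^<]|<(?!/p>))+)</p>')

# Look-ahead-free variant: non-greedy "+?" stops at the first "</p>".
# re.DOTALL lets "." match newlines, as "[^<]" does in the pattern above.
NON_GREEDY_RE = re.compile(r'<p class="cpt">(.+?)</p>', re.DOTALL)


def caption(pattern, html):
    """Return the first captured caption for the given pattern, or None."""
    match = pattern.search(html)
    return match.group(1) if match else None


# Hypothetical inputs: one with nested HTML, one spanning two lines.
samples = [
    '<p class="cpt">A <b>bold</b> caption</p><p>next</p>',
    '<p class="cpt">line one\nline two</p>',
]
for html in samples:
    print(caption(LOOKAHEAD_RE, html) == caption(NON_GREEDY_RE, html))
```

Emacs regexes do have non-greedy operators such as 「+?」. Since 「.」 does not match a newline there, a translation would need something like 「\(?:.\|\n\)+?」 in place of 「.+?」.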
But in any case, I can't see how this part would work