[Tutor] RE problems
Brandon Bennett
bennetb at gmail.com
Fri Aug 6 23:38:52 CEST 2004
I think this is a classic example of the greedy .*
Use "(<[^>]*>)"
This will match all characters between the < and > that are not > (the ending tag).
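
Here is a minimal sketch of how that pattern might be used, assuming the goal is to collect every tag in the file. Note that the quoted code calls p.match(), which only tries at the start of the string and returns a single match, so this sketch uses findall() instead. The sample string and the variable names are made up for illustration; they are not from the original script.

```python
import re

# Hypothetical sample text standing in for the contents of the HTML file.
sample = "<html><body><p>Hello</p>\n<a href='x'>link</a></body></html>"

# The negated character class [^>] stops at the first '>',
# so each tag is matched on its own.
tag_pattern = re.compile(r"(<[^>]*>)")

# match() only tries at the very start of the string and yields one match.
first_tag = tag_pattern.match(sample).group()

# findall() scans the whole string and returns every non-overlapping match.
all_tags = tag_pattern.findall(sample)

print(first_tag)
print(all_tags)
```

If the tags need to end up in the Word document, one option would be to join the list with newlines before assigning it to the sentence text.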
~Brandon
On Fri, 6 Aug 2004 17:23:07 -0400, James Alexander McCarney
<james.mccarney at cgi.com> wrote:
> Hi tutors,
>
> I am having problems returning everything I want from a regular expression.
> I am merely getting the first string in the html text file, which stands to
> reason as per the code.
>
> Could someone give me the magic to put all the strings tagged < > thus.
>
> Thanks for any tips you can provide. As for the document,
>
> I am reading amk's RE how-to; and I know it's all in there; it's just that
> I've cudgeled my brains a lot today. ;-(
>
> Best regards,
> James
>
> ##
> ##
> import pythoncom
> from win32com.client import Dispatch
> import re
>
> app = Dispatch('Word.Application')
> app.Visible = 1
>
> doc = app.Documents.Add()
>
> f = open("C:\myfile.html")
>
> ##
> ##
> ##
> ##
>
> allines = f.read()
> p=re.compile(r"(<.*?>)",re.DOTALL)
> m=p.match(allines)
>
> ##
>
> s1 = doc.Sentences(1)
> s1.Text = m.group()
>
> doc.SaveAs("C:\myTestPy.doc")
>
> app.Quit()
> app = None
>
> pythoncom.CoUninitialize()
> f.close()
>
> _______________________________________________
> Tutor maillist - Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>