[Tutor] Regular Expression guru sought
Kirk Bailey
idiot1@netzero.net
Mon Aug 4 22:27:01 EDT 2003
Well, as an example, here's the code of the moment:
http://www.tinylist.org/wikinehesa.txt
and here is the mostly working reader itself:
http://www.tinylist.org/cgi-bin/wikinehesa.py
It works, reads, and even manages a somewhat crufty parsing into paragraphs,
but it still needs to handle the <b></b> and <i></i> matter.
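
As a rough illustration of the <b></b> and <i></i> handling, here is a
minimal sketch using re substitution. It is not taken from wikinehesa.py.
It assumes the wiki marks bold as '''text''' and italics as ''text'', which
is only a guess at the markup, and the names BOLD, ITALIC and markup_to_html
are made up for the example.

```python
import re

# Assumed wiki markup (hypothetical, not from the actual wikinehesa source):
#   '''text''' -> <b>text</b>
#   ''text''   -> <i>text</i>
BOLD = re.compile(r"'''(.+?)'''")
ITALIC = re.compile(r"''(.+?)''")


def markup_to_html(line):
    """Convert the assumed bold/italic quote markup on one line to HTML tags."""
    # Bold goes first; otherwise the italic pattern would consume two of
    # the three quotes that open a bold span.
    line = BOLD.sub(r"<b>\1</b>", line)
    line = ITALIC.sub(r"<i>\1</i>", line)
    return line


if __name__ == "__main__":
    print(markup_to_html("This is '''bold''' and this is ''italic''."))
    # Five quotes on each side (bold plus italic) comes out mis-nested.
    print(markup_to_html("'''''both'''''"))
```

The first line converts cleanly, but the five-quote case comes out as
<b><i>both</b></i>, with the closing tags in the wrong order. That is the
sort of input where a pure re approach starts to strain, as the quoted
replies below warn.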
Sean 'Shaleh' Perry wrote:
> On Monday 04 August 2003 11:01, Jeff Shannon wrote:
>
>>Kirk Bailey wrote:
>>
>>>This thing is just flat going to need a lot of re stuff, and I need
>>>therefore to come up to speed on re.
>>
>>I'm not so sure that re's are quite what you want -- or at least, I'm
>>not sure if re's are enough.
>>
>>The problem with re's is that they're not very good at handling nested
>>data structures. It's often mentioned that re's are not appropriate for
>>parsing HTML or XML because of this limitation, and I suspect that the
>>same will apply to parsing your simple wiki code as well. You could
>>perhaps write re's that will handle the majority of likely cases, but
>>(if I'm right) it's almost assured that eventually, someone will write a
>>wiki page that can't be properly parsed with a re-based approach.
>>
>
>
> Indeed. The book "Text Processing in Python" may be of value here. Covers
> simple string methods, re's and real parsers. For me it turned out to cover
> mostly stuff I already knew but for someone just getting into text processing
> it is likely to be pretty valuable.
>
> Beyond that, Kirk, as always, an example is worth a thousand threads (-:
--
end
Cheers!
Kirk D Bailey
+ think +
http://www.howlermonkey.net +-----+ http://www.tinylist.org
http://www.listville.net | BOX | http://www.sacredelectron.org
Thou art free"-ERIS +-----+ 'Got a light?'-Promethieus
+ think +
Fnord.