a simple unicode question
gagsl-py2 at yahoo.com.ar
Thu Oct 22 11:23:32 CEST 2009
On Wed, 21 Oct 2009 15:14:32 -0300, <rurpy at yahoo.com> wrote:
> On Oct 21, 4:59 am, Bruno Desthuilliers <bruno.
> 42.desthuilli... at websiteburo.invalid> wrote:
>> beSTEfar wrote:
>> > When parsing strings, use Regular Expressions.
>> And now you have _two_ problems <g>
>> For some simple parsing problems, Python's string methods are powerful
>> enough to make REs overkill. And for any complex enough parsing (any
>> recursive construct for example - think XML, HTML, any programming
>> language etc), REs are just NOT enough by themselves - you need a full
>> blown parser.
> But keep in mind that many XML, HTML, etc parsing problems
> are restricted to a subset where you know the nesting depth
> is limited (often to 0 or 1), and for that large set of
> problems, RE's *are* enough.
I don't think so. Nesting isn't the only problem. REs cannot handle
comments, for example. And you must support unquoted attributes, single and
double quotes, any attribute ordering, empty tags, arbitrary whitespace...
If you don't, you are not reading XML (or HTML), but a specific file
format that merely resembles XML.
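
To illustrate the point, here is a minimal sketch (not taken from the thread) comparing a naive RE with the standard library's xml.etree.ElementTree on a small made-up document. The sample data, the NAIVE pattern and both function names are hypothetical, chosen only for this example. The document contains no nesting beyond depth 1, yet the RE misses two valid elements and picks up one that is commented out.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: the same kind of element written in three valid
# XML spellings, plus one element that is commented out.
DOC = """<items>
  <item id="1" name="a"/>
  <item name='b' id='2' />
  <item
      id = "3"
      name = "c"></item>
  <!-- <item id="4" name="d"/> -->
</items>"""

# Naive pattern (an assumption for illustration): double quotes only,
# id before name, single spaces, and a "/>" empty-tag ending.
NAIVE = re.compile(r'<item id="(\d+)" name="(\w+)"/>')


def ids_with_regex(text):
    """Return the id values the naive pattern finds, in document order."""
    return [m.group(1) for m in NAIVE.finditer(text)]


def ids_with_parser(text):
    """Return the id attribute of every <item> element seen by a real parser."""
    root = ET.fromstring(text)
    return [el.get("id") for el in root.iter("item")]


if __name__ == "__main__":
    print(ids_with_regex(DOC))   # misses items 2 and 3, reports commented-out 4
    print(ids_with_parser(DOC))  # sees items 1, 2 and 3, skips the comment
```

The pattern could be extended to cover each variation, but every such fix makes it longer and more fragile. Note also that ElementTree is an XML parser: it rejects unquoted attributes, so real-world HTML needs an HTML parser instead.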