Newbie question: strange Re behaviour -- Many thanks

Patrick.Bussi at space.alcatel.fr
Fri Apr 19 04:31:06 EDT 2002



Terrific. It works fine.

I understand now that I had missed something essential about regex.

Thank you Eric for this tutorial. Good bye.

---
Patrick Bussi
patrick.bussi at space.alcatel.fr


Any opinions expressed are my own and not necessarily those of my Company.

---------------------- Sent by Patrick Bussi/ALCATEL-SPACE on 19/04/2002
10:26 ---------------------------


Eric Brunel <eric.brunel at pragmadev.com> on 15/04/2002 19:26:42

To :      python-list at python.org
cc :      (bcc : Patrick Bussi/ALCATEL-SPACE)
Subject : Re: Newbie question: strange Re behaviour



Patrick.Bussi at space.alcatel.fr wrote:
>
> Could someone help me understand what's wrong with this very simple regex,
> whose initial purpose was to extract the value of an HTML field. For
> demonstration, I have oversimplified the code below:
>
> -----start-----
> [pat]$ python
> Python 2.0 (#1, Apr 11 2001, 19:18:08)
> [GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux-i386
> Type "copyright", "credits" or "license" for more information.
> >>>
> >>> import re
> >>> s='[^value=].*[^>]'

That means "any character not in the set ('v', 'a', 'l', 'u', 'e', '='),
followed by any string, followed by a character that is not a '>'". I guess
that's not what you want... but it explains why it ignores the leading 'a'
(which *is* in the set ('v', 'a', 'l', 'u', 'e', '=')).
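
A quick way to see this at the prompt is sketched below. The sample string 'abc>' is made up for illustration (it is not the input from the original post), and the code is written for a current Python rather than 2.0.

```python
import re

# The pattern from the question: each [...] is a character class,
# so [^value=] means "one character that is none of v, a, l, u, e, =".
pattern = re.compile('[^value=].*[^>]')

# Hypothetical sample input, chosen only to show the effect.
sample = 'abc>'

match = pattern.search(sample)
matched_text = match.group(0)
print(matched_text)  # prints: bc
```

The leading 'a' is skipped because it belongs to the excluded set, and the trailing '>' is dropped because the last class refuses it.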

Try:
s = "value=(.*)>"
and see what group(1) returns after compiling the re...
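
Spelled out, that suggestion might look like the sketch below. The HTML snippets are invented for the example, and the non-greedy variant at the end is an addition of mine, not part of the advice above: it is only there because '.*' is greedy and will run to the last '>' on the line.

```python
import re

# Literal text "value=", then a capturing group, then a closing '>'.
pattern = re.compile('value=(.*)>')

# Hypothetical input with a single field.
single = '<input name="q" value="hello">'
one_field = pattern.search(single).group(1)
print(one_field)  # prints: "hello"   (the quotes are part of the group)

# Hypothetical input with two tags on one line: the greedy '.*'
# stretches up to the last '>' it can find.
double = '<input value="a"><input value="b">'
greedy = pattern.search(double).group(1)

# A non-greedy '.*?' stops at the first '>' instead.
lazy = re.search('value=(.*?)>', double).group(1)
print(greedy)
print(lazy)
```

For anything beyond a quick extraction like this, an HTML parser is probably a safer choice than a regex.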

HTH
--
- Eric Brunel <eric.brunel at pragmadev.com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com

--
http://mail.python.org/mailman/listinfo/python-list