regex: im getting better

Duncan Booth duncan at
Thu Oct 3 10:26:20 CEST 2002

":B nerdy" <thoa0025 at> wrote in
news:uDLm9.21074$kd3.60008 at 

> $pattern = '|<input(\s+([^=>]*)="([^"]*)")*>|ism';
> I'd like to match all the input tags, but also, in a subexpression,
> I'd like to match each of the parameters in the format
> parameter_name="parameter_value"
> where parameter_name and parameter_value are strings.
> My pattern doesn't work: it only matches the last parameter. What's
> wrong with my pattern? And can someone show me how one would match my
> description above?
> Cheers
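
On the "only matches the last parameter" symptom: a capture group that sits inside a repetition keeps only the text from its final iteration, so groups 2 and 3 end up holding the last name="value" pair. The sketch below reproduces that in Python's re module and shows a two-pass workaround. The Python translation of the quoted pattern, the ATTR pattern, and the function names last_pair_only and all_pairs are assumptions made for illustration, not part of the original question.

```python
import re

# Python translation of the quoted pattern (the original uses PHP-style
# '|' delimiters and 'ism' flags; this translation is an assumption).
TAG = re.compile(r'<input(\s+([^=>]*)="([^"]*)")*>', re.I | re.S | re.M)

# Hypothetical helper pattern that matches one name="value" pair at a time.
ATTR = re.compile(r'\s+([^\s=>]+)="([^"]*)"')


def last_pair_only(html):
    """Return what the repeated groups hold after a match: the last pair."""
    m = TAG.search(html)
    return (m.group(2), m.group(3)) if m else None


def all_pairs(html):
    """Two passes: find each input tag, then scan its attribute text."""
    tags = re.findall(r'<input((?:\s+[^\s=>]+="[^"]*")*)\s*>', html, re.I)
    return [ATTR.findall(t) for t in tags]
```

Calling last_pair_only('<input x="1" y="2">') gives back only the y/2 pair, while all_pairs returns one list of (name, value) tuples per tag. This still breaks on unquoted or single-quoted values, which is part of why a parser is the safer tool.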

Personally I wouldn't even consider using regular expressions for a parsing 
task like this. Try the code below instead:

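import sgmllib

class MyParser(sgmllib.SGMLParser):
    # sgmllib calls do_<tag>() for each <tag> start tag; attributes is a
    # list of (name, value) pairs.
    def do_input(self, attributes):
        print "Input tag", attributes

if __name__=='__main__':
    data = '''
<input x="1" y="2">
<input p="q" r="s">
'''
    parser = MyParser()
    parser.feed(data)
    parser.close()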
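
The sgmllib module shipped with Python 2 only and was removed in Python 3. A minimal sketch of the same idea using the standard library's html.parser follows; the class name InputCollector and the helper collect_inputs are hypothetical names chosen for this example, and the parser collects the attributes in a list rather than printing them.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collects the attribute list of every <input> tag it sees."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        # tag arrives lowercased; attrs is a list of (name, value) pairs.
        if tag == "input":
            self.inputs.append(attrs)


def collect_inputs(html):
    """Feed the markup through the parser and return the collected attributes."""
    parser = InputCollector()
    parser.feed(html)
    parser.close()
    return parser.inputs
```

For the two sample tags above, collect_inputs returns one list of (name, value) tuples per input tag, in document order.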

Duncan Booth                                             duncan at
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
