regex question: backreferences in brackets

Jeremy Jones cypher_dpg at
Fri Dec 28 16:23:50 CET 2001

From the Python Library Reference on regular expressions, I read (in section 4.2.1 - concerning backreferences):

<snip> Inside the "[" and "]" of a character class, all numeric escapes are treated as characters. 

I want to be able to match any character other than a character that I already matched and have a named (or numbered) group for.  For example:

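import re

match_string = 'XXX|1|22|333|4444:'
# (.) captures the delimiter; \1 must then repeat it between the fields.
# The fields themselves still rely on a hard-coded [^|].
test_compile = re.compile(r'XXX(.)[^|]{1}\1[^|]{2}\1[^|]{3}\1[^|]{4}(.)')
mymatch = test_compile.match(match_string)
if mymatch:
    print("Found a match")
else:
    print("No match found")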

The strings that I am trying to match consist of 3 specific characters, followed by some delimiter, followed by N characters other than the delimiter, followed by the delimiter again, and so on.  The above code snippet works and matches the string perfectly.  I guess my question is this:  how can I match any character except the delimiter when I don't know beforehand what the delimiter will be?  I am thinking of using the backreference to the delimiter (i.e. \1), but you can't put that in the brackets.  I tried putting \1 in the brackets and it sort of worked.  But when I changed one of the numbers to a |, it still matched, which isn't what I want (and actually, it doesn't match if I leave the | hard-coded).  I also tried using a named group, (?P<delimiter>.) rather than (.), and putting (?P=delimiter) in the brackets, with no luck.  Any suggestions?  TIA.
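
To make the problem concrete, here is a minimal sketch, not a definitive answer. The first pattern shows why \1 in the brackets only "sort of" works: there it is read as the numeric escape for the character chr(1), so [^\1] happily accepts the delimiter. The second pattern is one possible workaround, assuming a single-character delimiter and that a negative lookahead is acceptable: (?:(?!\1).) consumes one character only if it is not whatever group 1 captured. The names in_class, not_delim and pattern are made up for this illustration.

```python
import re

# Inside [...], \1 is a numeric escape for chr(1), not a backreference,
# so [^\1] means "anything except chr(1)" and lets the delimiter through.
in_class = re.compile(r'XXX(.)[^\1]\1')

def not_delim(n):
    """Hypothetical helper: n characters, none equal to group 1."""
    return r'(?:(?!\1).){%d}' % n

# Fields of 1, 2, 3 and 4 non-delimiter characters, separated by the
# captured delimiter, followed by one final character.
pattern = re.compile(
    r'XXX(.)' + r'\1'.join(not_delim(n) for n in (1, 2, 3, 4)) + r'(.)'
)

for s in ('XXX|1|22|333|4444:', 'XXX,1,22,333,4444:', 'XXX|1|2||333|4444:'):
    if pattern.match(s):
        print("Found a match: " + s)
    else:
        print("No match found: " + s)
```

With this sketch the first two strings should match and the third (where one of the numbers was changed to a |) should not. Note that . does not match a newline unless re.DOTALL is given.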

Jeremy Jones
