Regex with ASCII and non-ASCII chars

Peter Otten __peter__ at web.de
Wed Jan 31 11:00:43 EST 2007


TOXiC wrote:

> Thanks, it works perfectly.
> What if I want to query a file stream?
> 
>     file = open(fileName, "r")
>     text = file.read()
>     file.close()

Convert the bytes read from the file to unicode before searching. For that
you have to know the file's encoding, e.g.:

      file_encoding = "utf-8" # replace with the actual encoding
      text = text.decode(file_encoding)
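
Putting the read, the decode and the search together, a minimal sketch could
look like the following. The helper name find_in_file and its utf-8 default
are my own assumptions for illustration, not part of your code, and the
sketch reads the file in binary mode so that the decode step is explicit:

```python
# -*- coding: utf-8 -*-
import io
import re


def find_in_file(file_name, pattern, file_encoding="utf-8"):
    """Return the first match of pattern in the decoded file, or None.

    find_in_file is a hypothetical helper; file_encoding must be
    replaced with the actual encoding of the file.
    """
    # Read the raw bytes; binary mode avoids any implicit decoding.
    with io.open(file_name, "rb") as f:
        raw = f.read()
    # Turn the bytes into unicode text using the assumed encoding.
    text = raw.decode(file_encoding)
    # Search the unicode text with the unicode pattern.
    regex = re.compile(pattern, re.IGNORECASE | re.UNICODE)
    match = regex.search(text)
    if match:
        return match.group()
    return None
```

With your names this would be called as
find_in_file(fileName, u"(ÿÿ‹ð…öÂ)", file_encoding). If the encoding is
guessed wrongly, decode() may raise UnicodeDecodeError or produce different
characters, in which case the pattern still won't match.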

>     regex = re.compile(u"(ÿÿ‹ð…öÂ)", re.IGNORECASE)
>     match = regex.search(text)
>     if (match):
>         result = match.group()
>         print result
>         WritePatch(fileName,text,result)
>     else:
>         result = "No match found"
>         print result
> 
> It returns "No match found" (the file contains the string "ÿÿ‹ð…öÂ",
> but...).
> Thanks in advance for the help!

Peter


