[Python-Dev] unicode regex quickie: should a newline be the same thing as a linebreak?

Fredrik Lundh fredrik@pythonware.com
Tue, 30 May 2000 17:19:48 +0200


M.-A. Lemburg wrote:
> > At the other end, the same compiled pattern can be applied
> > to either 8-bit or unicode strings.  It's all just characters to
> > the engine...
>
> Doesn't the engine remember whether the pattern was a string
> or Unicode?

The pattern object contains a reference to the original pattern
string, so I guess the answer is "yes, but indirectly".  But the core
engine doesn't really care -- it just follows the instructions in the
compiled pattern.
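
A minimal sketch of the "indirect" memory, assuming the `re` module of a
current Python 3 rather than the SRE build discussed in this thread. The
helper names are hypothetical. Note one difference from the behaviour
described above: today's `re` refuses to apply a text pattern to bytes.

```python
import re

text_pattern = re.compile("a.c")    # pattern given as text (str)
bytes_pattern = re.compile(b"a.c")  # pattern given as 8-bit data (bytes)


def pattern_kind(compiled):
    """Return the type name of the original pattern kept on the pattern object."""
    # The compiled object keeps a reference to the source pattern in .pattern.
    return type(compiled.pattern).__name__


def can_search(compiled, subject):
    """Report whether this pattern may be applied to this kind of subject."""
    try:
        compiled.search(subject)
    except TypeError:
        # Python 3 rejects mixing str patterns with bytes subjects and vice versa.
        return False
    return True
```

Here `pattern_kind(text_pattern)` gives "str" and `pattern_kind(bytes_pattern)`
gives "bytes", so the answer to "does it remember?" can be read off the
pattern object itself.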

> Thinking about this some more: I wouldn't even mind if
> the engine would use LINEBREAK for all strings :-). It would
> certainly make life easier whenever you have to deal with
> file input from different platforms, e.g. Mac, Unix and
> Windows.

That's what I originally proposed (and implemented).  But this may
(in theory, at least) break existing code.  If nothing else, it broke the
test suite ;-)
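
To illustrate the newline-versus-linebreak distinction, here is a small
sketch assuming the `re` module of a current Python 3, where `^` and `$`
in MULTILINE mode anchor only at "\n". The sample text and function names
are made up for the example; `str.splitlines()` stands in for the
"any linebreak" behaviour.

```python
import re

# Sample text mixing old Mac (\r), Unix (\n) and Windows (\r\n) line endings.
text = "mac\runix\nwindows\r\nend"


def lines_by_regex(s):
    """Collect whole-line words using ^ and $, which anchor only at "\\n"."""
    return re.findall(r"^\w+$", s, re.MULTILINE)


def lines_by_splitlines(s):
    """Split on every line boundary str.splitlines() knows (\\r, \\n, \\r\\n, ...)."""
    return s.splitlines()
```

With the sample text, the regex version finds only the last line, because
every other line is either preceded or followed by a bare "\r", while
`splitlines()` recovers all four.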

</F>

<project name="sre" complete="97.1%" />