M.-A. Lemburg
Mon, 25 Jun 2001

Mark Davis wrote:
> > My question was aimed in a slightly different direction,
> > though. I know that UTF-16 does not allow lone surrogates, but
> > how does Unicode itself treat these? If I have a sequence of Unicode
> > code points which includes an isolated surrogate code point,
> > would this be considered a legal Unicode sequence or not?
> It is a legal Unicode code point sequence. However, it is not a legal
> Unicode *character* sequence, since it contains code points that by
> definition cannot be used to represent characters.

So it's basically a matter of viewing a string as a sequence
of characters vs. a sequence of code points.
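
To make the distinction concrete, here is a minimal sketch in Python 3,
where a str is a sequence of code points and may hold a surrogate code
point on its own. The helper names (has_surrogate, is_character_sequence,
can_encode_utf8) are made up for illustration and are not part of any
library; the only assumption about Python is that its UTF-8 codec refuses
surrogate code points by default.

```python
def has_surrogate(code_points):
    """Return True if any code point lies in the surrogate range U+D800..U+DFFF."""
    return any(0xD800 <= cp <= 0xDFFF for cp in code_points)


def is_character_sequence(text):
    """A code point sequence is also a character sequence only if it
    contains no surrogate code points (hypothetical helper)."""
    return not has_surrogate(ord(ch) for ch in text)


def can_encode_utf8(text):
    """Check whether the default (strict) UTF-8 codec accepts the string."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


plain = "abc"
lone = "a\ud800b"  # legal as a code point sequence, contains an isolated surrogate

print(is_character_sequence(plain), can_encode_utf8(plain))
print(is_character_sequence(lone), can_encode_utf8(lone))
```

The second string can be built and indexed like any other code point
sequence, but it fails both checks, which matches the reading that it is
not a legal character sequence.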

Thanks for the explanation,
Marc-Andre Lemburg
