[Python-Dev] String terminology [was Re: Misc re.match() complaint]

Steven D'Aprano steve at pearwood.info
Wed Jul 17 13:50:42 CEST 2013


On 17/07/13 19:05, Terry Reedy wrote:

> Saying that input arguments can be "Unicode strings as well as 8-bit strings" (the wording is from 2.x, carried over to 3.x) does not necessarily exclude other inputs.

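"8-bit strings" seems somewhat ambiguous to me. Many Unicode strings are 8-bit when encoded as UTF-8 (pure-ASCII text takes one byte per character), and they can also be 8-bit internally under Python 3.3's flexible string representation (PEP 393). I prefer to stick to "Unicode string" or "text string" versus "byte string".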
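
To make the ambiguity concrete, here is a rough sketch rather than anything authoritative. It assumes CPython 3.3 or later, since the storage measurement relies on sys.getsizeof and on the PEP 393 layout, both of which are implementation details. The helper names utf8_width and storage_width are invented for this illustration and are not part of any library.

```python
import sys


def utf8_width(ch):
    """Number of bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def storage_width(ch):
    """Approximate bytes per character in the interpreter's own str storage.

    Assumption: CPython 3.3+ (PEP 393). The width is estimated by growing
    a string by one character and comparing sys.getsizeof results. Other
    implementations may report something different, or nothing useful.
    """
    return sys.getsizeof(ch * 101) - sys.getsizeof(ch * 100)


if __name__ == "__main__":
    # ASCII, Latin-1, other BMP, and astral (non-BMP) sample characters.
    samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]
    for ch in samples:
        print("U+%04X  utf-8 bytes: %d  internal bytes/char: %d"
              % (ord(ch), utf8_width(ch), storage_width(ch)))

    # A byte string is a different type altogether, whatever its width.
    print(type("abc").__name__, type(b"abc").__name__)
```

An ASCII-only text string is "8-bit" in both senses measured here, yet it is still a str and not a bytes object, which is why "text string" versus "byte string" is the distinction that actually matters.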

Pedants who point out that "byte" does not necessarily mean 8 bits, and therefore we should talk about octets, will be slapped with a large halibut :-)


-- 
Steven







