
Thanks for all the replies.

Anto, re:
"- some_unicode.encode('utf-8') is essentially for free (because it is already UTF-8 internally)
- some_bytes.decode('utf-8') is very cheap (it just needs to check that some_bytes is valid utf-8)"

I guess you mean the processing load for such operations will be low. So that's good then. Just wish they would both go away ...

Matt, re: "The defaults are generally better for the programming most people do imo."

Probably correct, just got spoiled, that's all. Had a glimmer of hope that the need for either would vanish, and I was wishing someone knew how.

Dan, re: "I think you mostly don't want u'foo' in 3.x or b'foo' in 2.x"

Actually, I don't want either, anywhere. If UTF-8 is used internally, and ASCII is already UTF-8, then it is all UTF-8, so ... Sigh ...

Thanks anyhow,
Jerry S.
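For readers following along, a minimal sketch of the round-trips being discussed (the sample string is my own illustration, not from the thread): `encode('utf-8')` turns a str into bytes, `decode('utf-8')` goes the other way, and ASCII bytes are already valid UTF-8, so they decode unchanged.

```python
# str in Python 3 is unicode text; bytes is raw binary data.
text = "caf\u00e9"            # 'café' as a str
data = text.encode("utf-8")   # str -> bytes (cheap, per the thread)
assert data == b"caf\xc3\xa9"  # U+00E9 encodes to the two bytes C3 A9

# bytes -> str just validates and wraps; it round-trips losslessly.
assert data.decode("utf-8") == text

# ASCII is a subset of UTF-8, so pure-ASCII bytes decode as-is.
assert b"foo".decode("utf-8") == "foo"
```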

On Thu, Feb 27, 2020 at 11:54:45AM -0600, Jerry Spicklemire wrote:
If you're not writing libs, you only need b'' in Python 3 -- bytes and binary data are not UTF-8; you need bytes for files, networks, compression, etc., and in Python 3 you'll get errors where you try to use strings in places that need bytes... I think the time when you could work just in ASCII is pretty much gone -- even in the US, people use emoji all over the place now, and you can only store such things using some form of Unicode.

m

--
Matt Billenstein
matt@vazor.com
http://www.vazor.com/
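A small sketch of the error Matt describes, using the standard-library zlib module as one example of a binary API (my choice of example, not his): passing a str where bytes are required raises a TypeError, so the emoji text must be encoded first.

```python
import zlib

payload = "emoji: \U0001F600"  # a str containing an emoji

# Binary APIs in Python 3 refuse str outright.
try:
    zlib.compress(payload)     # str where bytes are needed
    raised = False
except TypeError:
    raised = True
assert raised

# Encode to bytes first, then the binary API works.
compressed = zlib.compress(payload.encode("utf-8"))
assert zlib.decompress(compressed).decode("utf-8") == payload
```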
