No, it doesn't, that is the whole point of why I started this thread!!!!
Oops, right. I was thinking the other way around: passing u"a.out" where "a.out" is expected works fine; for this case, the memory management issues come into play.
Using Python StringObjects as binary buffers is also far less common than using StringObjects to store plain old strings, so if either of these uses bites the other it's the binary buffer that needs to suffer.
This is a conclusion I cannot agree with. Most strings are really binary, if you look at them closely enough :-)
I'm not sure I understand this remark. If you made it just for the smiley: never mind. If you really don't agree: please explain why.
When the discussion of tagging binary strings in source code came up, I started going through the standard library to see which string literals would have to be tagged as byte strings, and which are really character strings.
I found that the overwhelming majority of string literals in the standard Python library really denote byte strings, if you ignore doc strings. Sometimes, it isn't obvious that they are binary strings, hence the smiley. Look at httplib.py:
__all__ = ["HTTP", ...
Not sure: Are Python function names byte strings or character strings? Probably doesn't matter either way. Python source code is definitely byte-oriented, explicitly without any assumed encoding, so I'd lean towards byte strings here.
_UNKNOWN = 'UNKNOWN'
Looks like a character string. However, it is used in
self.version = _UNKNOWN # HTTP-Version
self.version is later sent on the byte-oriented HTTP protocol, so _UNKNOWN *is* a byte string.
_CS_IDLE = 'Idle'
These are enumerators, let's say they are character strings.
self.fp = sock.makefile('rb', 0)
Not sure. Could be a character string.
print "reply:", repr(line)
Definitely a character string.
version = "HTTP/0.9"
status = "200"
reason = ""
Protocol elements, thus byte string.
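To make the "protocol elements" point concrete, here is a rough sketch of reading the three values above straight off a raw status line. It assumes a Python with a separate bytes type, which is not what httplib.py was written against, and parse_status_line is a made-up helper for illustration, not a function from httplib:

```python
# Hypothetical helper, not taken from httplib: split a raw status line
# into its three protocol elements, keeping all of them as byte strings.
def parse_status_line(line):
    # Drop the line terminator, then split on whitespace at most twice
    # so that the reason phrase may itself contain spaces.
    parts = line.rstrip(b"\r\n").split(None, 2)
    if len(parts) == 3:
        version, status, reason = parts
    elif len(parts) == 2:
        version, status = parts
        reason = b""
    else:
        # No recognisable status line: use the defaults quoted above.
        version, status, reason = b"HTTP/0.9", b"200", b""
    return version, status, reason


print(parse_status_line(b"HTTP/1.0 404 Not Found\r\n"))
```

Nothing in there ever needs to know a character encoding; the values are compared and passed along as the bytes that arrived on the wire.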
So, I'm arguing that byte strings are far more common than you may think at first sight. In particular, everything passed to .read(), either of a file, or of a socket, is a byte string, since files and network connections are byte-oriented. In the particular case of network connections, applying system conventions for narrow strings would be foolish.
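As a small illustration of the .read() claim, the sketch below again assumes a Python where bytes and character strings are distinct types. The in-memory buffer and the local socket pair are just stand-ins I picked for a binary file and a network connection:

```python
import io
import socket


def read_from_file_like():
    # io.BytesIO stands in for a file opened in binary mode.
    fp = io.BytesIO(b"HTTP/1.0 200 OK\r\n")
    return fp.read()


def read_from_socket():
    # A connected local socket pair stands in for a network connection.
    a, b = socket.socketpair()
    try:
        a.sendall(b"HTTP/1.0 200 OK\r\n")
        return b.recv(1024)
    finally:
        a.close()
        b.close()


file_data = read_from_file_like()
sock_data = read_from_socket()

# Both results are byte strings; no encoding was applied on the way in.
print(type(file_data).__name__, type(sock_data).__name__)
```

Turning either result into a character string takes an explicit decode step with an encoding the application has to choose.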