[Python-Dev] Python3 "complexity"

Steven D'Aprano steve at pearwood.info
Thu Jan 9 13:28:54 CET 2014


On Thu, Jan 09, 2014 at 05:11:06PM +1000, Nick Coghlan wrote:
> On 9 January 2014 10:07, Ben Finney <ben+python at benfinney.id.au> wrote:

> > So, if what you want is to parse text and not get gibberish, you need to
> > *tell* Python what the encoding is. That's a brute fact of the world of
> > text in computing.
> 
> Set the mode to "rb", process it as binary. Done.

A nice point, but really, you lose a lot by doing so, even simple things 
like the ability to write:

    if word[0] == 'X'

Instead, you have to write things like:

    if word[0:1] == b'X'
    if chr(word[0]) == 'X'
    if word[0] == ord('X')
    if word[0] == 0x58

(pick the one that annoys you the least). And while bytes objects 
do have a surprising (to me) number of string-ish methods, like 
upper(), there are a few missing, like format() and isnumeric(). So it's 
not quite as straightforward as "done". If it were, we wouldn't need 
text strings :-)
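
For anyone who wants to try this at the interpreter, here is a rough 
sketch of the points above, assuming Python 3. The sample value `word` 
and the variable names are made up purely for illustration:

```python
# Illustrative sample only; any bytes value starting with b'X' would do.
word = b'Xylophone'

# Indexing a bytes object yields an int in Python 3, so comparing the
# result with a one-character str is simply False (no error is raised).
first_is_str_x = (word[0] == 'X')

# The four workarounds listed above all give the same answer.
workarounds = [
    word[0:1] == b'X',      # slicing keeps the result as bytes
    chr(word[0]) == 'X',    # convert the int back to a str
    word[0] == ord('X'),    # compare int with int
    word[0] == 0x58,        # same thing, with the code point spelled out
]

# bytes does have upper(), but has no format() or isnumeric() method.
upper_word = word.upper()
missing = [name for name in ('format', 'isnumeric')
           if not hasattr(bytes, name)]
```

Note that the naive comparison fails silently rather than raising, which 
is arguably the more annoying part.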



-- 
Steven

