Here's some I found from a few minutes of futzing around with r66821 of py3k on Linux.
- Having os.getcwdb isn't much use when you can't even run python in the first place when the current directory has "bad" bytes in it.
That's not true: it *is* of much use. Python will live in /usr/bin, which has a nicely-decodable path.
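As a rough sketch of that point (the variable names are illustrative only): os.getcwdb returns the current directory as bytes, and passing bytes to os.listdir yields bytes entries, so the path never has to be decoded.

```python
import os

# os.getcwdb() returns the current working directory as a bytes object,
# so it does not depend on the path being decodable.
cwd_bytes = os.getcwdb()

# Passing bytes to os.listdir() makes it return bytes entries as well.
entries = os.listdir(cwd_bytes)

print(type(cwd_bytes).__name__, len(entries))
```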
Currently Python outputs:
  Could not find platform independent libraries <prefix>
  Could not find platform dependent libraries <exec_prefix>
  Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
  Fatal Python error: Py_Initialize: can't initialize sys standard streams
  ImportError: No module named encodings.utf_8
  Aborted
I can't reproduce that. This happens (for me) when Python lives in a directory that has an undecodable path - not when the current directory is undecodable.
- I'd think "find . -type f -print0 | xargs -0 python -c 'pass'" ought to work (with files with "bad" bytes being returned by find), which means that Python shouldn't blow up and refuse to start when there's a non-properly-encoding argv ("Could not convert argument 1 to string" and exiting isn't appropriate behavior).
Contributions are welcome. *Of course* you can access these files with the POSIX API. However, Python's path handling can't. See above for why I don't consider this a serious bug on Unix.
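For illustration only, a minimal sketch of why such an argument cannot be converted to a string, assuming a UTF-8 locale and a hypothetical file name containing the byte 0xFF: a strict decode of those bytes fails.

```python
# Hypothetical file name as find might return it; 0xFF is never valid in UTF-8.
raw_arg = b"bad\xffname.txt"

try:
    raw_arg.decode("utf-8")  # strict decoding, assuming a UTF-8 locale
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(decoded_ok)
```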
- Of course, just being able to start the interpreter isn't quite enough: you'll want to be able to access that argument list too, somehow (add sys.argvb?).
Perhaps. However, I don't see the need to be able to do so in Python 3.0.
- And then, getopt and optparse modules should work on bytestring vectors, so that you can use sys.argvb without writing your own argument parser. They don't currently.
And I hope they never will. Using bytes to represent this stuff would just bring back the 2.x situation, so some other solution must be found, for 3.1 (or 3.2).
- There's no os.environb for bytewise access to the environment. Seems important.
Not to me. I don't have environment variables with non-ASCII characters in them, and I think few other people do.
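One way to check this claim on a given system is sketched below. The helper non_ascii_vars is hypothetical (not an existing API) and simply reports variables whose name or value cannot be encoded as ASCII.

```python
import os

def non_ascii_vars(env):
    """Return the sorted names of variables whose name or value is not pure ASCII."""
    found = []
    for name, value in env.items():
        try:
            name.encode("ascii")
            value.encode("ascii")
        except UnicodeEncodeError:
            found.append(name)
    return sorted(found)

# Report any such variables in the current process environment.
print(non_ascii_vars(os.environ))
```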
I'm sure there's even more APIs dealing with pathnames, command line arguments, or environment variables that ought to be able to handle both bytes and strings, that currently don't.
Please, no.

Regards,
Martin