On Fri, Mar 6, 2020 at 5:54 PM Guido van Rossum <guido@python.org> wrote:

(Since bytes may be used for file names I think they should get this new capability too.)

I don’t really care one way or another, but is it really still the case that bytes need to be used for filenames? For uses other than just passing them around?

Yes, Linux in particular does not guarantee that file names are using any particular encoding (let alone a consistent encoding for different files). The only two bytes that are special are '\0' and '/'.
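To make that concrete, here is a minimal sketch of handling a path purely as bytes. The file name and the helper `path_components` are made up for illustration; nothing here touches the file system.

```python
# A name containing byte 0xff, which can never occur in valid UTF-8.
# Linux would accept it, since it contains neither b"\0" nor b"/".
raw_name = b"report-\xff.txt"


def path_components(path: bytes) -> list[bytes]:
    """Split a bytes path on b'/' without assuming any text encoding."""
    return [part for part in path.split(b"/") if part]


components = path_components(b"/srv/data/" + raw_name)
print(components)

# A strict decode of the same name fails, which is why some programs
# keep file names as bytes instead of str.
try:
    raw_name.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)
```

Splitting on b'/' is safe because that byte is one of the two the kernel treats specially, whatever encoding the rest of the name uses.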

I *think* I understand the issues. And I can see that some software would need to work with filenames as arbitrary bytes. But that doesn't mean that you can do much with them that way.

I can see filename.split(b'/') for instance, but how could you strip a prefix or suffix without knowing the encoding? filename.strip_suffix(b'.txt') would only work for ASCII-compatible encodings. There's no way around the fact that you have to make SOME assumptions about the encoding if you are going to do anything other than pass it around or work with the b'/' byte. And if that's the case, then you might as well decode with the 'surrogateescape' error handler so the program won't crash.
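A rough sketch of both points, assuming UTF-8 as the working encoding (that choice, the sample names, and the helper `strip_txt_suffix` are my assumptions for illustration):

```python
def strip_txt_suffix(name: bytes) -> bytes:
    """Drop a trailing '.txt' by decoding with surrogateescape.

    Bytes that are not valid UTF-8 become lone surrogates on decode and
    are turned back into the original bytes on encode, so nothing is lost.
    """
    text = name.decode("utf-8", errors="surrogateescape")
    if text.endswith(".txt"):
        text = text[: -len(".txt")]
    return text.encode("utf-8", errors="surrogateescape")


# Latin-1 encoded name: 0xe9 is not valid UTF-8 here, yet it round-trips.
latin1_name = b"caf\xe9.txt"
print(strip_txt_suffix(latin1_name))

# In an encoding that is not ASCII-compatible, the b".txt" byte pattern
# simply is not present, so a bytes-level suffix test cannot match.
utf16_name = "report.txt".encode("utf-16-le")
print(utf16_name.endswith(b".txt"))
```

The second case is the one where no bytes-level method can help: you have to know the encoding before ".txt" means anything.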

Getting OT, but I do wonder if we should continue to support (and therefore encourage) the use of bytes in inappropriate ways.

I didn't like the name stripstr anyway. :-)

Neither do I, so I guess I shouldn't have brought this up ...

-CHB



On Fri, Mar 6, 2020 at 16:10 Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Mar 06, 2020 at 03:33:49PM -0800, Ethan Furman wrote:

> I think we should have a `stripstr()` as an alias for strip, and a new
> `stripchr()`.

Shouldn't they be the other way around?

`strip` removes chars from a set of chars; the proposed method will
remove a prefix/suffix.
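A small illustration of the difference. The helper `remove_suffix` is a hypothetical stand-in for the proposed method, not an existing API:

```python
def remove_suffix(text: str, suffix: str) -> str:
    """Remove `suffix` once, as a whole substring, if `text` ends with it."""
    if suffix and text.endswith(suffix):
        return text[: -len(suffix)]
    return text


filename = "test.txt"

# rstrip treats its argument as a set of characters {'.', 't', 'x'} and
# keeps removing from the right until it meets a character not in the set.
print(filename.rstrip(".txt"))

# The proposed behaviour removes the exact suffix and nothing more.
print(remove_suffix(filename, ".txt"))
```

The rstrip call also eats the final "t" of "test", which is exactly the kind of surprise the new method is meant to avoid.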


> And I'm perfectly okay with bytes() not having those methods.  ;-)

If heavy users of bytes want these methods, they can request them
separately. There's no backwards compatibility requirement for new
string methods to be automatically added to bytes.

I guess the question now is do we need a PEP?

--
--Guido van Rossum (python.org/~guido)


--
Christopher Barker, PhD
