[Python-Dev] Python-3.0, unicode, and os.environ

André Malo nd at perlig.de
Tue Dec 9 10:42:32 CET 2008


* M.-A. Lemburg wrote: 


> On 2008-12-09 09:41, Anders J. Munch wrote:
> > On Sun, Dec 7, 2008 at 3:53 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> >>>> try:
> >>>>  files = os.listdir(somedir, errors='strict')
> >>>> except OSError as e:
> >>>>  log(<verbose error message that includes somedir and e>)
> >>>>  files = os.listdir(somedir)
> >
> > Instead of a codecs error handler name, how about a callback for
> > converting bytes to str?
> >
> > os.listdir(somedir, decoder=bytes.decode)
> > os.listdir(somedir, decoder=lambda b: b.decode(preferredencoding,
> > errors='xmlcharrefreplace'))
> > os.listdir(somedir, decoder=repr)
> >
> > ISTM that would be simpler and more flexible than going over the
> > codecs registry.  One caveat though is that there's no obvious way of
> > telling listdir to skip a name.  But if the default behaviour for
> > decoder=None is to skip with a warning, then the need to explicitly
> > ask for files to be skipped would be small.
> >
> > Terry's example would then be:
> >>>> try:
> >>>>  files = os.listdir(somedir, decoder=bytes.decode)
> >>>> except UnicodeDecodeError as e:
> >>>>  log(<verbose error message that includes somedir and e>)
> >>>>  files = os.listdir(somedir)
>
> Well, this is not too far away from just putting the whole decoding
> logic into the application directly:
>
> files = [filename.decode(filesystemencoding, errors='warnreplace')
>          for filename in os.listdir(dir)]
>
> (or os.listdirb() if that's where the discussion is heading)
>
> ... and that also tells us something about this discussion: we're
> trying to come up with some magic to work around writing two
> lines of Python code.
>
> I'd just have all the os APIs return bytes and leave whatever
> conversion to Unicode might be necessary to a higher level API.

[...]

What I'm saying ;-)

+1.
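
For concreteness, the "two lines of Python code" could look roughly like
the sketch below. It assumes a Python 3 where os.listdir() returns bytes
names when given a bytes path and where os.fsencode() is available.
listdir_decoded is a made-up helper name, and the stock 'replace' handler
stands in for the hypothetical 'warnreplace' handler quoted above.

```python
import os
import sys


def listdir_decoded(path, encoding=None, errors="replace"):
    """Return the names in `path` as str, decoding in the application.

    Hypothetical helper, not part of the os module.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    # A bytes path makes os.listdir() return the raw bytes names.
    raw_names = os.listdir(os.fsencode(path))
    # The caller, not the os module, picks the encoding and error policy.
    return [name.decode(encoding, errors) for name in raw_names]
```

Passing errors="strict" instead would raise UnicodeDecodeError on the
first undecodable name, which is the case Terry's try/except is meant to
catch.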

nd

