Best practice for opening files for newbies?

chris.barker at noaa.gov chris.barker at noaa.gov
Thu Sep 18 20:54:25 CEST 2014


On Thursday, September 18, 2014 9:38:00 AM UTC-7, Chris Angelico wrote:
> On Fri, Sep 19, 2014 at 2:19 AM,  <chris.barker at noaa.gov> wrote:
> > So: there are way too many ways to open a simple file to read or write a bit of text (or binary):
> > open()
> 
> Personally, I'd just use this, all the way through - and not importing
> from io, either. But others may disagree.

Well, the catch is that it's seriously tricky to work with text files that aren't ASCII-compatible if you do that...
 
> Be clear about what's text and what's bytes, everywhere. When you do
> make the jump to Py3, you'll have to worry about text files vs binary
> files, and if you need to support Windows as well as Unix, you need to
> get that right anyway, so just make sure you get the two straight.

yup -- I've always emphasized that point, but from a py2 perspective (and with the built-in open() file object), what is a utf-8 encoded file? Text or bytes? It's bytes -- and you need to do the decoding yourself. Why make people do that?
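
To make the contrast concrete, here is a minimal sketch, assuming a UTF-8 file (the temporary path and the sample text are made up for illustration). It uses io.open, which takes an encoding argument on both py2 and py3:

```python
import io
import os
import tempfile

# Hypothetical sample data: a short text with one non-ASCII character.
sample = u"caf\u00e9"

# Illustrative scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write the text with an explicit, known encoding.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(sample)

# Binary mode: you get bytes back and have to decode them yourself.
with io.open(path, "rb") as f:
    raw = f.read()
decoded_by_hand = raw.decode("utf-8")

# Text mode with a known encoding: the file object does the decoding.
with io.open(path, "r", encoding="utf-8") as f:
    decoded_by_io = f.read()
```

Both routes end up with the same text; the second just doesn't ask the reader of the code to remember the decode step.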

In the past, I started with open(), ignored unicode for a while, then when I introduced unicode, I pointed them to codecs.open() (I hadn't discovered io.open yet). Maybe I should stick with this approach, but it feels like a bad idea.

> Save yourself a lot of trouble later on by keeping 
> the difference very clear.

exactly -- but it's equally clear, and easier and more robust, to have two types of files: binary and text, where text requires a known encoding, rather than three types: binary, ascii text, and encoded text, which is really binary that you can then decode to make text....

Think of something as simple and common as looping through the lines in a file!
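
As a rough sketch of what that looks like with a text file opened with a known encoding (again via io.open; the path and the two sample lines are invented for the example):

```python
import io
import os
import tempfile

# Illustrative scratch file with two lines of non-ASCII text.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"na\u00efve\nr\u00e9sum\u00e9\n")

# Iterating over the text-mode file yields already-decoded lines,
# so no per-line .decode() call is needed inside the loop.
lines = []
with io.open(path, encoding="utf-8") as f:
    for line in f:
        lines.append(line.rstrip(u"\n"))
```

With a plain py2 open(), each line in that loop would be a byte string, and the decoding would have to happen inside the loop body.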

> And you can save yourself some more
> conversion trouble by tossing this at the top of every .py file you
> write:
> 
> from __future__ import print_function, division, unicode_literals

yup -- I've been thinking of recommending that to my students as well -- particularly unicode_literals
 
> But mainly, just go with the simple open() call and do the job the 
> easiest way you can. And go Py3 as soon as you can, because ...

A discussion for another thread....

Thanks,
    -Chris



