On Sat, Apr 14, 2012 at 18:56, Guido van Rossum <guido@python.org> wrote:
On Sat, Apr 14, 2012 at 3:50 PM, Brett Cannon <brett@python.org> wrote:
> On Sat, Apr 14, 2012 at 18:32, Guido van Rossum <guido@python.org> wrote:
>> Funny, I was just thinking about having a simple standard API that
>> will let you open files (and list directories) relative to a given
>> module or package regardless of how the thing is loaded. If we
>> guarantee that there's always a __loader__ that's a first step, though
>> I think we may need to do a little more to get people who currently do
>> things like open(os.path.join(os.path.dirname(__file__),
>> 'some_file_name')) to switch. I was thinking of having a stdlib
>> function that you give a module/package object, a relative filename,
>> and optionally a mode ('b' or 't') and returns a stream -- and sibling
>> functions that return a string or bytes object (depending on what API
>> the user is using either the stream or the data can be more useful).
>> What would we call those functions and where would they live?

> IOW go one level lower than get_data() and return the stream and then just
> have helper functions which I guess just exhaust the stream for you to
> return bytes or str? Or are you thinking that somehow providing a function
> that can get an explicit bytes or str object will be more optimized than
> doing something with the stream? Either way you will need new methods on
> loaders to make it work more efficiently since loaders only have get_data()
> which returns bytes and not a stream object. Plus there is currently no API
> for listing the contents of a directory.

Well, if it's a real file, and you need a stream, that's efficient,
and if you need the data, you can read it. But if it comes from a
loader, and you need a stream, you'd have to wrap it in a StringIO
instance. So having two APIs, one to get a stream, and one to get the
data, allows the implementation to be more optimal -- it would be bad
to wrap a StringIO instance around data only so you can read the data
from the stream again...

Right, so loaders would need to grow new methods, which is fine and can
be done in a backwards-compatible way by wrapping get_data() in
io.BytesIO or io.StringIO.
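
To make the stream/data split concrete, here is a rough sketch, not a
proposal for the final API. The names open_resource() and read_resource()
are made up for illustration and exist nowhere in the stdlib. The sketch
assumes the module has a __file__ and that its loader, if any, provides
the existing get_data(path) method.

```python
import io
import os


def _resource_path(module, relative_name):
    # Resolve the name relative to the directory holding the module.
    return os.path.join(os.path.dirname(module.__file__), relative_name)


def read_resource(module, relative_name, mode="b", encoding="utf-8"):
    """Hypothetical helper: return the data as bytes ('b') or str ('t')."""
    path = _resource_path(module, relative_name)
    loader = getattr(module, "__loader__", None)
    if loader is not None and hasattr(loader, "get_data"):
        # Loader-backed storage (e.g. a zip file): no stream is created.
        data = loader.get_data(path)
    else:
        with open(path, "rb") as f:
            data = f.read()
    return data if mode == "b" else data.decode(encoding)


def open_resource(module, relative_name, mode="b", encoding="utf-8"):
    """Hypothetical helper: return a readable stream for the resource."""
    path = _resource_path(module, relative_name)
    if os.path.isfile(path):
        # A real file on disk: hand back a real file object.
        if mode == "b":
            return open(path, "rb")
        return open(path, "r", encoding=encoding)
    # Otherwise fall back to wrapping the loader's data in an in-memory
    # stream, which is the backwards-compatible default.
    data = read_resource(module, relative_name, mode, encoding)
    return io.BytesIO(data) if mode == "b" else io.StringIO(data)
```

A caller that only wants the data uses read_resource() and never pays for
a wrapper stream; a caller that wants a stream gets a real file when one
exists and an in-memory stream otherwise.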
 

> As for what to call such functions, I really don't know since they are
> essentially abstract functions above the OS which work on whatever storage
> backend a module uses.
>
> For where they should live, it depends if you are viewing this as more of a
> file abstraction or something that ties into modules. For the former it
> seems like shutil or something that dealt with higher order file
> manipulation. If it's the latter I would say importlib.util.

If pkg_resources is in the stdlib that would be a fine place to put it.

It's not.

-Brett
 

--
--Guido van Rossum (python.org/~guido)