`importlib.resources` access whole directories as resources

An `importlib.resources.as_file` equivalent but for whole directories. To access a directory of files in a package and load them (for example, a skybox with 6 faces), one would need to use `as_file` 6 times with 6 context managers. Moreover, if the API required a path to a folder that contained the 6 images, this would require manual extraction.
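To make the complaint concrete, here is a minimal sketch of what the per-file approach looks like today, assuming a throwaway package named `skybox_demo` with six placeholder face files (both the package name and the file names are hypothetical). A single `contextlib.ExitStack` at least avoids nesting six `with` statements, but each face still needs its own `as_file` call.

```python
import importlib
import sys
import tempfile
from contextlib import ExitStack
from importlib.resources import as_file, files
from pathlib import Path

FACES = ("right", "left", "top", "bottom", "front", "back")


def build_demo_package(root: Path) -> None:
    """Create a throwaway package 'skybox_demo' holding six fake face images."""
    pkg = root / "skybox_demo"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    for face in FACES:
        (pkg / f"{face}.png").write_bytes(face.encode())


def load_faces(package: str) -> dict:
    """Enter one as_file() context per face, all managed by one ExitStack."""
    base = files(package)
    with ExitStack() as stack:
        paths = {
            face: stack.enter_context(as_file(base / f"{face}.png"))
            for face in FACES
        }
        # Every path is a real file on disk until the stack exits.
        return {face: path.read_bytes() for face, path in paths.items()}


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        faces = load_faces("skybox_demo")
    finally:
        sys.path.remove(tmp)
```

This still does not give an API a single folder path containing all six images, which is the second half of the request.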

On Sun, 2022-05-08 at 17:36 +0000, tankimarshal2@gmail.com wrote:
Hi, Can you provide examples of use-cases that would require something like that? files() returns a Traversable, which is a subset of pathlib.Path (in practice, if the module exists on the filesystem it would actually be a pathlib.Path instance), which you can use to access the data. as_file() exists because needing a file on the file system is sort of common, but from my experience the same is not true for directories. While I can see the value in what you propose, I worry it is not a common enough use-case. Some code examples would definitely help out here. Cheers, Filipe Laíns
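As a rough illustration of the point about `files()`, the sketch below iterates over the `Traversable` it returns and reads every image without calling `as_file()` at all. The package `traversable_demo.images` and its contents are hypothetical and are created in a temporary directory only so that the snippet is self-contained.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway package "traversable_demo" with an "images" subpackage.
    images_dir = Path(tmp) / "traversable_demo" / "images"
    images_dir.mkdir(parents=True)
    (images_dir.parent / "__init__.py").write_text("")
    (images_dir / "__init__.py").write_text("")
    for name in ("a.png", "b.png", "c.png"):
        (images_dir / name).write_bytes(name.encode())

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        images = files("traversable_demo.images")
        # iterdir() yields child Traversables; keep only the .png entries.
        loaded = {
            entry.name: entry.read_bytes()
            for entry in images.iterdir()
            if entry.is_file() and entry.name.endswith(".png")
        }
    finally:
        sys.path.remove(tmp)
```

This covers the case where the bytes are enough. It does not help when a third-party API insists on a directory path on disk.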

I think this is a bit of an "XY" feature request. Currently, resources must be individual files inside a package. A directory cannot itself be a "resource". So for example if you have a directory structure like this:

    my_great_app/
        __init__.py
        something.py
        data/
            assets/
                images/
                    a.png
                    b.png
                    c.png

Each of data/, assets/, and images/ must also be a package, with its own __init__.py file. You cannot access the resource data/assets/images/a.png in the package my_great_app; you must access the resource a.png in the package my_great_app.data.assets.images. This is (in my opinion) unintuitive, easy to forget, and moderately annoying.

So I think the feature request here is that Python start allowing directories as "resources", rather than just single files within a package. Alternatively, if for some reason directories cannot themselves be resources, allowing file resources in subdirectories (without creating a new subpackage) would also be a nice ergonomic improvement. I'm not sure if this poses issues for package resolution, namespace packages, etc. I imagine that this somewhat-obvious feature was omitted for a good reason.

I honestly wasn't aware that this was added in 3.7 -- I always thought handling resources was up to setuptools or hand-written code. So it's nice to see that it's there in the stdlib. And while I haven't actually used this feature, I have read through the docs and have a few comments. On Sun, May 15, 2022 at 7:21 AM Greg Werbin <outthere@me.gregwerbin.com> wrote:
> Currently, resources must be individual files inside a package. A directory cannot itself be a "resource".
According to the docs, "a *resource* is a binary artifact that is shipped within a package." A directory is not a binary artifact -- it can't have actual data in it like a file can. So I'm not sure how it would make sense for a directory to be a resource.
It doesn't seem that bad to me -- my thought is that if you have that much structure to your resources, maybe the resource system is not the right tool for the job. Nevertheless ...
That might be possible, and would make sense to me. However, the entire point of resources is to provide an abstraction -- the individual resources may not be files on disk at all -- so extending the nested path concept may not make sense. That being said, I also note that importlib.abc.ResourceReader is an ABC -- so I think you could very well make your own ResourceReader that could traverse a directory hierarchy if you wanted to. And the contents() method is defined as:

"""
Returns an iterable of strings over the contents of the package. Do note that it is not required that all names returned by the iterator be actual resources, e.g. it is acceptable to return names for which is_resource() would be false. Allowing non-resource names to be returned is to allow for situations where how a package and its resources are stored are known a priori and the non-resource names would be useful. For instance, returning subdirectory names is allowed so that when it is known that the package and resources are stored on the file system then those subdirectory names can be used directly.
"""

which implies to me that the system is expected to optionally handle subdirs. Perhaps you can write a ResourceReader that meets your needs, and if it turns out to be generally useful, it could later be added to the stdlib.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
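As a sketch of what "traversing a directory hierarchy" could look like, the snippet below walks the `Traversable` returned by `files()` recursively. Note that it does not implement a custom `ResourceReader`; it relies on the existing `Traversable` methods `iterdir()`, `is_dir()` and `is_file()`. The helper `walk` and the package `walk_demo` are hypothetical names introduced only for this illustration, and the example assumes an ordinary on-disk install.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path


def walk(root, prefix=""):
    """Yield (slash-separated name, Traversable) for each file below root."""
    for entry in sorted(root.iterdir(), key=lambda e: e.name):
        if entry.name == "__pycache__":
            continue  # bytecode cache, not package data
        name = prefix + entry.name
        if entry.is_dir():
            yield from walk(entry, prefix=name + "/")
        elif entry.is_file():
            yield name, entry


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical package "walk_demo" whose data lives in plain subdirectories.
    pkg = Path(tmp) / "walk_demo"
    (pkg / "assets" / "images").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "assets" / "readme.txt").write_text("notes")
    (pkg / "assets" / "images" / "foo.png").write_bytes(b"png-bytes")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        found = {name: entry.read_bytes() for name, entry in walk(files("walk_demo"))}
    finally:
        sys.path.remove(tmp)
```

The names produced this way contain "/" separators, which is close in spirit to the "subdirectory names" mentioned in the quoted documentation.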

On 16/05/22 5:05 pm, Christopher Barker wrote:
> a directory is not a binary artifact -- it can't have actual data in it like a file can.
and:
These two statements are contradictory. If a resource is an abstraction, why can't it be represented by a collection of files in a directory rather than a single file? -- Greg

On Sun, May 15, 2022 at 11:57 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have no idea why you say that.
> If a resource is an abstraction, why can't it be represented by a collection of files in a directory rather than a single file?
If you want to call your collection of files a single resource, then sure -- but then it's not the directory that's the resource, it's the collection of files that's the resource -- again, it's an abstraction. I see this a bit like how you can't add an empty directory to git -- a directory has no information to add -- i.e. the path is not the file, and a directory is only part of a path. -CHB

On Mon, 2022-05-16 at 18:55 +1200, Greg Ewing wrote:
It's not contradictory. "the individual resources may not be files on disk at all" hints that the resource might be, e.g., an entry in a database that contains binary data. Directories are not binary data, so they cannot be a resource. Filipe Laíns

Non-"binary" resources are already in widespread use, so perhaps that requirement shouldn't be in the docs at all. In practice, a resource is "any data file other than a Python source file." Moreover, I see no reason why a resource name in general shouldn't be allowed to contain a "/" character! Perhaps the simplest way to implement such a thing is to leave all the existing importlib.resources APIs unchanged, but then add a separate function that grants access to collections of resources within a package. Maybe this is as simple as adding a `dir=` kwarg to importlib.resources.files(), defaulting to `dir=None`.

On Mon, May 16, 2022 at 7:38 AM Greg Werbin <outthere@me.gregwerbin.com> wrote:
> Non-"binary" resources are already in widespread use,
I didn't write the docs -- but a text file is indeed a binary blob -- it only becomes text when it is read and decoded, so there is no distinction at this level. I *think* the idea behind that is that a resource is not necessarily a file on disk -- which is why it's not called a file. It is a blob of bytes (i.e. binary) that is stored somehow and can be retrieved somehow. In practice, I imagine most resources are indeed files, or at least start that way before they are packed into a zip archive or something.
> so perhaps that requirement shouldn't be in the docs at all. In practice, a resource is "any data file other than a Python source file."
See above -- a resource may not be a file.
> Moreover, I see no reason why a resource name in general shouldn't be allowed to contain a "/" character!
I agree here -- this may very well be the solution -- have any of you tested it? I have not -- it may well be possible right now. Is there any code checking and rejecting resource names with a slash in them? BTW -- can someone point to a good example of resources in action -- I find it a bit challenging to work only from documentation -- I really like concrete examples.

Another note: IIUC, the entire point of the "resource" abstraction is that they may not be files on disk (at least at run time). I, and many others, have included data files in packages in a very simple way:

1) Add the file to the directory where you want it.
2) Make sure your package builder (e.g. setuptools) copies the file into the package on installation.
3) Find the file at run time with a path relative to the __file__ attribute of a parent package.

This is simple, straightforward and works great -- as long as your package is only used with a conventional install (set zip_safe to False for setuptools). But it breaks when the package is installed some other way -- as a zip file, maybe by PyInstaller, or who knows what? I'm pretty sure THAT is the problem that importlib resources is trying to solve in an abstract way.

From: https://importlib-resources.readthedocs.io/en/latest/using.html

"""
What exactly do we mean by “a resource”? It’s easiest to think about the metaphor of files and directories on the file system, though it’s important to keep in mind that this is just a metaphor. Resources and packages *do not* have to exist as physical files and directories on the file system.
"""

Note from that document, there is this example:

    from importlib_resources import files
    # Reads contents with UTF-8 encoding and returns str.
    eml = files('email.tests.data').joinpath('message.eml').read_text()

In that case, there is a directory, `data`, that has an __init__.py file in it, making it a package.
Note that you still need to use joinpath to make the final path to your resource -- what happens if you add a nested path there?

    eml = files('email.tests.data').joinpath('subdir/message.eml').read_text()

It may "just work" :-)

Keep in mind that the other point of importlib.resources is to leverage the import system -- the import system defines a package as a dir with an __init__ -- and it already knows how to find files (or abstractions of files) in a package. There is no such thing as a "resource directory" as distinct from a package -- and I don't think adding that complication is worth saving the effort of adding an __init__.py to a dir in which you want to store resources. After all, if you want to put modules in a nested dir, you have to add an __init__.py to the dir as well. -CHB
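The "it may just work" question can be tried directly. The following is a small experiment rather than a statement about every loader: it assumes a package installed as ordinary files on disk, and uses a hypothetical package `nested_demo` whose `subdir` deliberately has no `__init__.py`.

```python
import importlib
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "nested_demo"
    (pkg / "subdir").mkdir(parents=True)  # note: no __init__.py inside subdir
    (pkg / "__init__.py").write_text("")
    (pkg / "subdir" / "message.eml").write_text("Subject: hi\n", encoding="utf-8")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        base = files("nested_demo")
        # A nested name passed to joinpath() in one go ...
        eml = base.joinpath("subdir/message.eml").read_text(encoding="utf-8")
        # ... and the same resource reached one path segment at a time.
        same = (base / "subdir" / "message.eml").read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)
```

Whether the slash-containing form behaves the same for other storage backends depends on the `Traversable` implementation in use, so the segment-by-segment form is probably the more conservative choice.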

On Mon, 2022-05-16 at 14:36 +0000, Greg Werbin wrote:
No, Python source files are resources too. "Resource" is an abstract concept akin to files; its purpose is to support use-cases other than just files on the OS file system (e.g. zip file, tarball, database). Adding a "directory" reference goes against the purpose of abstracting the FS away. Packages are akin to directories, files are akin to resources, when operating on a FS. Cheers, Filipe Laíns
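A short sketch of the zip-file case may help here. It builds a hypothetical package `zipped_demo` inside an archive, imports it from there, and compares the `__file__`-relative lookup with the resource API. The names are made up for the example, and the behaviour shown assumes the standard zip importer.

```python
import importlib
import sys
import tempfile
import zipfile
from importlib.resources import files
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "bundle.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_demo/__init__.py", "")
        zf.writestr("zipped_demo/greeting.txt", "hello from a zip")

    sys.path.insert(0, str(archive))
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("zipped_demo")
        # The __file__-relative path points inside the archive, so it is
        # not a real file on the file system.
        naive_path = Path(module.__file__).parent / "greeting.txt"
        naive_exists = naive_path.exists()
        # The resource API reads the same data through the import system.
        text = files("zipped_demo").joinpath("greeting.txt").read_text(encoding="utf-8")
    finally:
        sys.path.remove(str(archive))
        sys.modules.pop("zipped_demo", None)
```

Here there is no directory on disk to hand out at all, which is the situation the abstraction is meant to cover.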

I think I might have confused things further, my apologies. I see 3 related but distinct feature requests here:

1) Allow a directory to be a resource in and of itself, and provide APIs for working with "directory-type" resources.
2) Allow resources with "/" in the name, and provide APIs to interact with such resources as if they were filesystem directories, remaining agnostic as to whether they actually are filesystem directories, to the best extent possible. This is similar to how cloud storage providers like Amazon S3 handle slashes in names. This would mean that you can put resources in subdirectories of package directories that are not themselves subpackages.
3) Provide APIs for working with resources within packages in a fashion that more closely resembles directory and subdirectory relationships. I am not sure exactly what this would entail.

As discussed above, (1) is a significant deviation from the existing model of "resources" as described in the docs and implemented in importlib.resources, so it's probably not a viable option. I am not sure what (3) would consist of, but I imagine that there is continued room for improvement over importlib.resources.files(). (2) seems like a useful feature, if only because it seems more natural to write "assets/images/foo.png" in resource "my_app.data", as opposed to "foo.png" in resource "my_app.data.assets.images". But as stated above, maybe it's too little benefit to warrant changing the model wherein a directory is analogous to a package.

On Tue, May 17, 2022 at 1:46 PM Greg Werbin <outthere@me.gregwerbin.com> wrote:
> 1) Allow a directory to be a resource in and of itself, and provide APIs for working with "directory-type" resources.
I'm still confused here -- what would a "directory be a resource" even mean? A directory is not a file, it is a "container" of files. I still have no idea what people mean when they say a directory could be a resource. If we go back to the OP's point, they were saying that they didn't like that you have to put an __init__.py in a dir in order to put resources in it. And I get that idea, but that wouldn't be making a dir a resource, it would be allowing a directory structure inside a package in which to put resources. Remember that the directory that holds a package (has an __init__.py file) is not itself a package; if anything, the __init__.py file is the resource here.
Sure, this could make sense -- but a point here: The whole "resource" thing in importlib is, well, part of the import system. IIUC, it is provided so that the existing system built for loading modules can be used for things other than Python modules. It was not designed to be general purpose. No one seems to mind that with the following file structure:

    pkg/
        __init__.py
        a_module.py
        a_subdir/
            something.py

you can't do:

    from pkg.a_subdir import something

or

    from pkg import a_subdir/something

I think one reason a dir has to have an __init__.py to be a package (and to hold other modules/packages) is that a dir is NOT a file, i.e. not a resource, i.e. has no way to hold any information. The __init__.py is needed so that when you import a package, there's a way to put some names in that namespace. That is -- the directory itself cannot be a package -- it doesn't have a "binary blob" at all.
Isn't that the same as (2)?
> As discussed above, (1) is a significant deviation from the existing model of "resources" as described in the docs and implemented in importlib.resources, so it's probably not a viable option.
Exactly -- if you want a richer resource handling system, maybe look to a third-party package. How does setuptools' pkg_resources handle all this, for instance?
I'm confused again -- in this case, foo.png is a resource, my_app.data is not a resource, and nor is, or will be, assets/images.
I think the way it works now is that you'd get the resource something like this:
To get the path to an actual file:

    files(my_app.data.assets.images).joinpath('foo.png')

(or my preferred syntax:)

    files(my_app.data.assets.images) / ('foo.png')

To get the binary data (which I think is the whole point of this):

    foo_img = read_binary('my_app.data.assets.images', 'foo.png')

Of course, you can first do:

    from my_app.data.assets import images

and then:

    foo_img = read_binary(images, 'foo.png')

In order for that to work, the data, assets, and images dirs all need to have an __init__.py file, because the first argument to read_binary has to be a package. Again, is that really such a heavy lift?

But I'm a bit confused as to what's being proposed here. If dirs with only non-Python-module resources didn't need an __init__.py, would you be able to access them the same way? Or are we thinking it would be:

    foo_img = read_binary('my_app.data', 'assets/images/foo.png')

which probably wouldn't work today (untested). However, if you only want to support actual files on disk, this would work:

    foo_img = open(files(my_app.data) / 'assets/images/foo.png', 'rb').read()

It might take some machinations to get your build system to copy those files in the right place, but it should work just fine.

By the way:

    In [5]: resources.is_resource?
    Signature: resources.is_resource(package: Union[str, module], name: str) -> bool
    Docstring:
    True if 'name' is a resource inside 'package'.

    Directories are *not* resources.

And check this out:

    In [6]: resources.is_resource('importlib','__init__.py')
    Out[6]: True

So the __init__.py file is a resource. And:

    In [9]: resources.is_resource('importlib','metadata')
    Out[9]: False

metadata is a sub-package inside importlib; it has an __init__.py, but it is not a resource itself. -CHB
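The same file-versus-directory distinction can be checked through the `Traversable` returned by `files()`. The sketch below uses the standard library's `email` package instead of `importlib`, purely because `email/mime` has been a subpackage directory across many Python versions. It assumes the standard library is installed as ordinary files.

```python
from importlib.resources import files

email_pkg = files("email")

# __init__.py is a plain file inside the package, i.e. a resource.
init_is_file = email_pkg.joinpath("__init__.py").is_file()

# "mime" is a subpackage directory: a container, not a resource.
mime_is_dir = email_pkg.joinpath("mime").is_dir()
mime_is_file = email_pkg.joinpath("mime").is_file()
```

Here `is_file()` plays roughly the role of `is_resource()` in the session above, while `is_dir()` identifies the containers.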

On Tue, 2022-05-17 at 23:04 -0700, Christopher Barker wrote:
And this is no longer true as of importlib_resources 3.2.0 or Python 3.10, as we added support for resources in namespace packages. https://github.com/python/importlib_resources/pull/196 I agree with the point that directories are not resources, in fact: https://github.com/python/importlib_resources/blob/main/importlib_resources/... Cheers, Filipe Laíns

On 16/05/2022 23:22, Filipe Laíns wrote:
This is a very important point that Christopher Barker has also made several times and isn't being heard. A resource, including Python source or byte code, could be stored in any manner and found by any kind of loader along the sys.meta_path. This abstraction is emphasised repeatedly in the documentation for the import system. (The point Chris made was more easily isolated in Filipe's message so I quote that.) It is very difficult to shake the idea that packages and modules are exactly filesystem constructs, and likewise that resources are only ever the content of files that sit alongside in the same directories. It is difficult to shake because it is almost always how the abstraction is realised in our own work. Also, API exists that seems not to be thought out for any other case, which worries me. But it's not true: can we please try here? The importlib abstraction has semantics that are very like a file system, but differ in critical ways (see namespace packages or how built-ins are found). The appeal for "a directory" runs contrary to this abstraction (in its language). It probably maps to a valid idea in the abstraction, and Chris has tried to get us to think in terms of a "collection" of resources not necessarily always represented by files. I don't understand this well, but I note that https://docs.python.org/3/library/importlib.html#module-importlib.resources talks about resource containers. (Unfortunately, it also only uses filesystem-like examples.) Might that be what is asked for? -- Jeff Allen

On Sun, 2022-05-08 at 17:36 +0000, tankimarshal2@gmail.com wrote:
Hi, Can you provide examples of use-cases that would require something like that? files() returns a Traversable, which is a subset of pathlib.Path (in practice, if the module exists on the filesystem it would actually be a pathlib.Path instance), which you can use to access the data. as_file() exists because needing a file on the file system is sort of common, but from my experience the same is not true for directories. While I can see the value in what you propose, I worry it is not a common enough use-case. Some code examples would definitely help out here. Cheers, Filipe Laíns

I think this is a bit of an "XY" feature request. Currently, resources must be individual files inside a package. A directory cannot itself be a "resource". So for example if you have a directory structure like this: my_great_app/ __init__.py something.py data/ assets/ images/ a.png a.png c.png Each of data/ assets/ and images/ must also be a package, with its own __init__.py file. You cannot access the resource data/assets/images/a.png in the package my_great_app, you must access the resource a.png in the package my_great_app.data.assets.images. This is (in my opinion) unintuitive, easy to forget, and moderately annoying. So I think the feature request here is that Python start allowing directories as "resources", rather than just single files within a package. Alternatively, if for some reason directories cannot themselves be resources, allowing file resources in subdirectories (without creating a new subpackage) would also be a nice ergonomic improvement. I'm not sure if this poses issues for package resolution, namespace packages, etc. I imagine that this somewhat-obvious feature was omitted for a good reason.

I honestly wasn't aware that this was added in 3.7 -- I always thought handling resources was up to setuptools or hand-written code. So it's nice to see that it's there in the stdlib. And while I haven't actually used this feature, I have read through the docs and have a few comments. On Sun, May 15, 2022 at 7:21 AM Greg Werbin <outthere@me.gregwerbin.com> wrote:
Currently, resources must be individual files inside a package. A directory cannot itself be a "resource".
According to the docs, " a *resource* is a binary artifact that is shipped within a package." a directory is not a binary artifact -- it can't have actually data in it like a file can. So I'm not sure how it wouel make sense for a directory to be a resource.
It doesn't seem that bad to me -- my thoughts is that if you have that much structure to your resources, maybe the resource system is not the right tool for the job. nevertheless ...
That might be possible, and would make sense to me. However, the entire point of resources is to provide an abstraction -- the individual resources may not be files on disk at all -- so extending the nested path concpet may not make sense. That being said, I also note that importlib.abc.ResourceReader is an ABC -- so I think you could very well make your own ResourceReader that could traverse a directory hierarchy if you wanted to. """ And the contents() method is defined as: Returns an iterable of strings over the contents of the package. Do note that it is not required that all names returned by the iterator be actual resources, e.g. it is acceptable to return names for which is_resource() would be false. Allowing non-resource names to be returned is to allow for situations where how a package and its resources are stored are known a priori and the non-resource names would be useful. For instance, returning subdirectory names is allowed so that when it is known that the package and resources are stored on the file system then those subdirectory names can be used directly. """ which implies to me that the system is expected to optionally handle subdirs. Perhaps you can write a ResourceResader that meets your needs, and it turns out to be generally useful, it could be later added to the stslib. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

On 16/05/22 5:05 pm, Christopher Barker wrote:
a directory is not a binary artifact -- it can't have actually data in it like a file can.
and:
These two statements are contradictory. If a resource is an abstraction, why can't it be represented by a collection of files in a directory rather than a single file? -- Greg

On Sun, May 15, 2022 at 11:57 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have no idea why you say that.
If a resource is an abstraction, why can't it be represented by a collection of files in a directory rather than a single file?
if you want to call your collection of files a single resource, then sure -- but then it's not the directory that's the resource, it's the collection of files that's the resource -- again, it's an abstraction. I see this a bit like how you can't add an empty directory to git -- a directory has no information to add -- i.e. the path is not the file, and a directory is only part of a path. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

On Mon, 2022-05-16 at 18:55 +1200, Greg Ewing wrote:
It's not contradictory. "the individual resources may not be files on disk at all" hints that the resource might be for eg. an entry on a database that contains binary data. Directory are not binary data, they cannot be a resource. Filipe Laíns

Non-"binary" resources are already in widespread use, so perhaps that requirement shouldn't be in the docs at all. In practice, a resource is "any data file other than a Python source file." Moreover, I see no reason why a resource name in general shouldn't be allowed to contain a "/" character! Perhaps the simplest way to implement such a thing is to leave all the existing importlib.resources APIs unchanged, but then add a separate function that grants access to collections of resources within a package. Maybe this is as simple as adding a `dir=` kwarg to importlib.resources.files(), defaulting to `dir=None`.

On Mon, May 16, 2022 at 7:38 AM Greg Werbin <outthere@me.gregwerbin.com> wrote:
Non-"binary" resources are already in widespread use,
I didn't write the docs -- but a text file is, indeed a binary blob -- it only becomes text when it is read and decoded, so there is no distinction at this level. I *think* the idea behind that is that a resource is not necessarily a file on disk -- which is why it's not called a file. It is a blob of bytes (i.e. binary) that is stored somehow and can be retrieved somehow. In practice, I imagine most resources are indeed files, or at least start that way before they are packed into a zip archive or something.
so perhaps that requirement shouldn't be in the docs at all. In practice, a resource is "any data file other than a Python source file."
See above -- a resource may not be a file.
Moreover, I see no reason why a resource name in general shouldn't be allowed to contain a "/" character!
I agree here -- this may very well be the solution -- have any of you tested it? I have not -- it may well be possible right now. Is there any code checking and rejecting resource names with a slash in them? BTW -- can someone point to a good example of resources in action -- I find it a bit challenging to work only fjrom documentation -- I really like concrete examples. Another note: IIUC, the entire point of the "resource" abstraction is because they may not be files on disk (at least at run time). I, and many others have included data file in packages in a very simple way: 1) Add the file to the directory where you want it. 2) Make sure your package builder (e.g. setuptools) copies the file into the package on installation 3) Find the file at run time with a relative path to the __fille_ attribute in a parent package. This is simple, straightforward and works great -- as long as your package is only used with a conventional install (set zipsafe to False for setuptools. But it breaks when the package is installed some other way -- as a zip file, maybe by PyInstaller, or who knows what? I'm pretty sure THAT is the problem that importlib resources is trying to solve in an abstract way. From: https://importlib-resources.readthedocs.io/en/latest/using.html """ What exactly do we mean by “a resource”? It’s easiest to think about the metaphor of files and directories on the file system, though it’s important to keep in mind that this is just a metaphor. Resources and packages *do not* have to exist as physical files and directories on the file system. """ Note from that document, there is this example: from importlib_resources import files # Reads contents with UTF-8 encoding and returns str. eml = files('email.tests.data').joinpath('message.eml').read_text() In that case, there is a directory, `data`, that has a __init__.py file in it, making it a package. 
Note that you still need to use joinpath to make the final path to your resource -- what happens if you add a nested path there? eml = files('email.tests.data').joinpath('subdir/message.eml').read_text() It may "just work" :-) Keep in mind that the other point of importlib.resources is to leverage the import system -- the import system defines a package as a dir with a __init__ -- and it already knows how to find files (or abstactions of files) in a package. There is no such thing as a "resource directory" as distinct from a package -- and I don't think adding that complication is worth saving the effort of adding a __init__.py to a dir in which you want to store resources. After all, if you want to put modules in a nested dir, you have to add a __init__.py to the dir as well. -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

On Mon, 2022-05-16 at 14:36 +0000, Greg Werbin wrote:
No, Python source files are resources too. "resource" is an abstract concept akin to files, its propose is to allow support other use-cases than just files on the OS file system (eg. zip file, tarball, database). Adding a "directory" reference goes against the purpose of abstracting the FS away. Packages are akin to directories, files are akin to resources, when operating on a FS. Cheers, Filipe Laíns

I think I might have confused things further, my apologies. I see 3 related but distinct feature requests here: 1) Allow a directory to be a resource in and of itself, and provide APIs for working with "directory-type" resources. 2) Allow resources with "/" in the name, and provide APIs to interact with such resources as if the were filesystem directories, remaining agnostic as to whether they actually are filesystem directories, to the best extent possible. This is similar to how cloud storage providers like Amazon S3 handle slashes in names. This would mean that you can put resources in subdirectories of package directories that are not themselves subpackages. 3) Provide APIs for working with resources within packages in a fashion that more closely resembles directory and subdirectory relationships. I am not sure exactly what this would entail. As discussed above, (1) is a significant deviation from the existing model of "resources" as described in the docs and implemented in importlib.resources, so it's probably not a viable option. I am not sure what (3) would consist of, but I imagine that there is continued room for improvement over importlib.resources.files(). (2) seems like a useful feature, if only because it seems more natural to write "assets/images/foo.png" in resource "my_app.data", as opposed to "foo.png" in resource "my_app.data.assets.images". But as stated above, maybe it's too little benefit to warrant changing the model where in a directory is analogous to a package.

On Tue, May 17, 2022 at 1:46 PM Greg Werbin <outthere@me.gregwerbin.com> wrote:
1) Allow a directory to be a resource in and of itself, and provide APIs for working with "directory-type" resources.
I'm still confused here -- what would a "directory be a resource" even mean? A directory is not a file, it is a "container" of files. I still have no idea what people mean when they say a directory could be a resource. If we go back to the OPs point, they were saying that they didn't like that you have to put an __init__.py in a dir in order to put resources in it. And I get that idea, but that wouldn't be making a dir a resource, it would be allowing a directory structure inside a package in which to put resources. Remember that the directory that holds a package (has an __init__.py file) is not itself a package, if anything, the __init__.py file is the resource here.
Sure, this could make sense -- but a point here: The whole "resource" thing in import lib is, well, part of the import system. IIUC, it is provided so that the existing system built for loading modules can be used for things other than Python modules. It was not designed to be general purpose. No one seems to mind that with the following file structure: pkg/ __init__.py a_module.py a_subdir/ something.py you can't do: from pkg.a_subdir import something or from pkg import a_subdir/something I think one reason a dir has to have a __init__.py to be a package (and to hold other modules/packages) is that a dir is NOT a file, i.e. not a resource, i.e. has no way to hold any information. The __init__.py is needed so that when you import a package, there's a way to put some names in that namespace. That is -- the directory itself can not be a package -- it doesn't have a "binary blob" at all.
Isn't that the same as (2)?

As discussed above, (1) is a significant deviation from the existing model of "resources" as described in the docs and implemented in importlib.resources, so it's probably not a viable option.
Exactly -- if you want a richer resource-handling system, maybe look to a third-party package. How does setuptools' pkg_resources handle all this, for instance?
I'm confused again -- in this case, foo.png is a resource, my_app.data is not a resource, and nor is, or will be, assets/images.
I think the way it works now is that you'd get the resource something like this:

To get the path to an actual file:

    files(my_app.data.assets.images).joinpath('foo.png')

(or my preferred syntax:)

    files(my_app.data.assets.images) / 'foo.png'

To get the binary data (which I think is the whole point of this):

    foo_img = read_binary('my_app.data.assets.images', 'foo.png')

Of course, you can first do:

    from my_app.data.assets import images

and then:

    foo_img = read_binary(images, 'foo.png')

In order for that to work, the data, assets, and images dirs all need to have an __init__.py file, because the first argument to read_binary has to be a package. Again, is that really such a heavy lift?

But I'm a bit confused as to what's being proposed here. If dirs with only non-Python-module resources didn't need an __init__.py, would you be able to access them the same way? Or are we thinking it would be:

    foo_img = read_binary('my_app.data', 'assets/images/foo.png')

which probably wouldn't work today (untested). However, if you only want to support actual files on disk, this would work:

    foo_img = open(files(my_app.data) / 'assets/images/foo.png', 'rb').read()

It might take some machinations to get your build system to copy those files to the right place, but it should work just fine.

By the way:

    In [5]: resources.is_resource?
    Signature: resources.is_resource(package: Union[str, module], name: str) -> bool
    Docstring:
    True if 'name' is a resource inside 'package'.

    Directories are *not* resources.

And check this out:

    In [6]: resources.is_resource('importlib', '__init__.py')
    Out[6]: True

So the __init__.py file is a resource. And:

    In [9]: resources.is_resource('importlib', 'metadata')
    Out[9]: False

metadata is a sub-package inside importlib; it has an __init__.py, but it is not a resource itself.

-CHB

--
Christopher Barker, PhD (Chris)
Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
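The "files are resources, directories are not" distinction above can be reproduced without relying on the layout of the standard library. The following is a minimal sketch built on importlib.resources.files() (Python 3.9+). The package demo_pkg and the helper looks_like_resource are hypothetical names used only for illustration, and the helper is an approximation of the documented behaviour, not the library's own implementation.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def looks_like_resource(package: str, name: str) -> bool:
    """Return True if `name` is a file directly inside `package`."""
    return importlib.resources.files(package).joinpath(name).is_file()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: demo_pkg/ with one sub-package and one data file.
    pkg_dir = Path(tmp) / "demo_pkg"
    (pkg_dir / "sub").mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "sub" / "__init__.py").write_text("")
    (pkg_dir / "logo.png").write_bytes(b"\x89PNG")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        checks = {
            name: looks_like_resource("demo_pkg", name)
            for name in ("__init__.py", "logo.png", "sub", "missing.txt")
        }
    finally:
        # Undo the temporary import state.
        sys.path.remove(tmp)
        sys.modules.pop("demo_pkg", None)

for name, result in checks.items():
    print(f"{name}: {result}")
```

In this layout the two files are reported as resources, while the sub-package directory and the nonexistent name are not.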

On Tue, 2022-05-17 at 23:04 -0700, Christopher Barker wrote:
And this is no longer true as of importlib_resources 3.2.0 or Python 3.10, as we added support for resources in namespace packages. https://github.com/python/importlib_resources/pull/196 I agree with the point that directories are not resources, in fact: https://github.com/python/importlib_resources/blob/main/importlib_resources/... Cheers, Filipe Laíns

On 16/05/2022 23:22, Filipe Laíns wrote:
This is a very important point that Christopher Barker has also made several times and isn't being heard. A resource, including Python source or byte code, could be stored in any manner and found by any kind of loader along the sys.meta_path. This abstraction is emphasised repeatedly in the documentation for the import system. (The point Chris made was more easily isolated in Filipe's message so I quote that.)

It is very difficult to shake the idea that packages and modules are exactly filesystem constructs, and likewise that resources are only ever the content of files that sit alongside in the same directories. It is difficult to shake because it is almost always how the abstraction is realised in our own work. Also, APIs exist that seem not to be thought out for any other case, which worries me. But it's not true: can we please try here?

The importlib abstraction has semantics that are very like a file system, but differ in critical ways (see namespace packages or how built-ins are found). The appeal for "a directory" runs contrary to this abstraction (in its language). It probably maps to a valid idea in the abstraction, and Chris has tried to get us to think in terms of a "collection" of resources not necessarily always represented by files.

I don't understand this well, but I note that https://docs.python.org/3/library/importlib.html#module-importlib.resources talks about resource containers. (Unfortunately, it also only uses filesystem-like examples.) Might that be what is asked for?

--
Jeff Allen
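One way to treat a "collection" of resources without assuming it lives on the filesystem is to walk it through the Traversable-style methods (iterdir, is_dir, read_bytes, name) and materialise a copy only when a real directory is required, as in the skybox example from the original request. The sketch below is a rough approximation of a directory-flavoured as_file. The names copy_tree and directory_as_path are hypothetical, and the demo uses a pathlib.Path as a stand-in for whatever files() would return for a real package.

```python
import contextlib
import tempfile
from pathlib import Path


def copy_tree(source, target: Path) -> None:
    """Recursively copy a Traversable-like `source` into the directory `target`."""
    target.mkdir(parents=True, exist_ok=True)
    for child in source.iterdir():
        if child.is_dir():
            copy_tree(child, target / child.name)
        else:
            (target / child.name).write_bytes(child.read_bytes())


@contextlib.contextmanager
def directory_as_path(source):
    """Yield a real directory holding a copy of `source`; it is removed on exit."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / source.name
        copy_tree(source, target)
        yield target


# Demo: a pathlib.Path stands in for the Traversable that files() would return.
with tempfile.TemporaryDirectory() as src:
    skybox = Path(src) / "skybox"
    (skybox / "extra").mkdir(parents=True)
    for face in ("px", "nx", "py", "ny", "pz", "nz"):
        (skybox / f"{face}.png").write_bytes(face.encode())
    (skybox / "extra" / "readme.txt").write_bytes(b"six faces")

    with directory_as_path(skybox) as extracted:
        # Collect the copied files while the temporary directory still exists.
        copied = sorted(
            p.relative_to(extracted).as_posix()
            for p in extracted.rglob("*")
            if p.is_file()
        )
        sample = (extracted / "px.png").read_bytes()

print(copied)
```

Always copying is the simplest choice for a sketch. A real implementation would presumably skip the copy when the source is already a directory on disk.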
participants (7)
- Chris Angelico
- Christopher Barker
- Filipe Laíns
- Greg Ewing
- Greg Werbin
- Jeff Allen
- tankimarshal2@gmail.com