> Are you saying that servers like Nginx or whatever your mini-server uses don’t have a way to blanket ignore files?  That would surprise me, and it seems like a lurking security vulnerability regardless of importlib.resources or __init__.py files.  I would think that you’d want to whitelist file extensions, and that `.py` would not be in that list.

"Whitelisting file extensions" is very uncommon. You just put the files you intend to serve in your static directory, and don't put the files you don't intend to serve there. Mixing code and static data is usually seen as a sign of muddy PHP-like thinking.

From what I can tell, if you wanted to exclude '__init__.py' from Nginx in particular, you would have to write an unconventional Nginx configuration, one that decides whether a path refers to a static file using a regex that excludes anything ending in '__init__.py'. Anything is possible, but this would be a significant discouragement to using importlib.resources.
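
To make the rule concrete, here is a rough Python sketch of the kind of check such a configuration would have to encode. This is not Nginx syntax, and the function name `is_servable` is made up for illustration; it only shows the "serve everything except paths ending in '__init__.py'" logic.

```python
import re

# Hypothetical rule: a request path is refused if it ends in '__init__.py'.
_INIT_PY = re.compile(r"__init__\.py$")


def is_servable(request_path: str) -> bool:
    """Return True if the path may be served as a static file."""
    return _INIT_PY.search(request_path) is None


if __name__ == "__main__":
    for candidate in ("assets/css/main.css", "assets/css/__init__.py"):
        print(candidate, is_servable(candidate))
```

Every server pointed at the directory would need its own equivalent of this rule, which is the discouragement I mean.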

In practice, Flask's built-in server has its own logic for locating static files; it doesn't involve importlib, and I don't know exactly what it does. Tornado appears to ask for an absolute path, so users mostly use __file__ to discover that path.
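
The __file__ idiom I mean looks roughly like this. It is a minimal sketch: the helper name `static_dir_for` and the "static" subdirectory are assumptions for illustration, not anything Tornado or Flask defines.

```python
from pathlib import Path


def static_dir_for(module_file: str, subdir: str = "static") -> Path:
    """Return the absolute path of `subdir` next to the given module file."""
    # resolve() makes the path absolute, so the result does not depend on
    # the process's working directory staying the same afterwards.
    return Path(module_file).resolve().parent / subdir


# Typical use inside the module that configures the server:
#     STATIC_DIR = static_dir_for(__file__)
```

The resulting path is what you would then hand to whatever server setting asks for an absolute static directory.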

> Is this a problem you’ve actually encountered or is it theoretical?

I had a situation where I wanted the same files to be both served by Flask as static files and loadable as resources in my tests. Making this work with pkg_resources took a few tries. It sounds like importlib.resources won't really improve the situation.



On Tue, 15 May 2018 at 16:30 Barry Warsaw <barry@python.org> wrote:
On May 15, 2018, at 14:03, Rob Speer <rspeer@luminoso.com> wrote:

> Consider a mini-Web-server written in Python (there are, of course, lots of these) that needs to serve static files. Users of the Web server will expect to be able to place these static files somewhere relative to the directory their code is in, because the files are version-controlled along with the code. If you make developers configure an absolute path, they'll probably use __file__ anyway to get that path, so that it works on more systems than their own without an installer or a layer of configuration management.

You don’t need an absolute path, since you don’t pass file system paths to importlib.resources, and even if you relative import a module, you can pass that module to the APIs and it will still work, since the loaders know where they got the modules from.

> If I understand the importlib.resources documentation, it won't give you a way of accessing your static files directory unless you place an '__init__.py' file in each subdirectory, and convert conventional locations such as "assets/css/main.css" into path(mypackage.assets.css, 'main.css').

That is correct.  Note that we’re not necessarily saying that we won’t add hierarchical path support to the `resource` arguments of the various APIs, but it does complicate the semantics and implementation.  It’s also easier to add features when the use cases warrant them than to remove features that turn out to be YAGNI.
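
As a rough, self-contained sketch of the layout being discussed: the snippet below builds a throwaway `mypackage/assets/css/` tree in a temporary directory, with an `__init__.py` at every level, and then reads `main.css` through `importlib.resources.path`. The helper names and the CSS content are invented for illustration, and some later Python versions emit a DeprecationWarning for this call.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create mypackage/assets/css/main.css with an __init__.py at each level."""
    css_dir = root / "mypackage" / "assets" / "css"
    css_dir.mkdir(parents=True)
    for pkg_dir in (root / "mypackage", root / "mypackage" / "assets", css_dir):
        (pkg_dir / "__init__.py").write_text("")
    (css_dir / "main.css").write_text("body { margin: 0; }\n")


def load_main_css() -> str:
    """Import the demo package and read main.css as a package resource."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            css_pkg = importlib.import_module("mypackage.assets.css")
            # The resource is addressed by (package, name), not by a
            # hierarchical path such as "assets/css/main.css".
            with importlib.resources.path(css_pkg, "main.css") as css_path:
                return css_path.read_text()
        finally:
            # Undo the sys.path change and forget the throwaway modules.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules
                         if n == "mypackage" or n.startswith("mypackage.")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(load_main_css())
```

Without the `__init__.py` files, `assets` and `css` would not be regular packages, which is the requirement described above.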

> That's already a bit awkward. But do you even want __init__.py to be in your static directory? Even if you tell the mini-server to ignore __init__.py, when you upgrade to a production-ready server like Nginx and point it at the same directory, it won't know anything about this and it'll serve your __init__.py files as static files, leaking details of your system. So you probably wouldn't do this.

Are you saying that servers like Nginx or whatever your mini-server uses don’t have a way to blanket ignore files?  That would surprise me, and it seems like a lurking security vulnerability regardless of importlib.resources or __init__.py files.  I would think that you’d want to whitelist file extensions, and that `.py` would not be in that list.

Is this a problem you’ve actually encountered or is it theoretical?

> This is one example; there are other examples of non-Python directories that you need to be able to access from Python code, where adding a file named __init__.py to the directory would cause undesired changes in behavior.

Can you provide more examples?

> Again, importlib.resources is a good idea. I will look into using it in the cases where it applies. But the retort of "well, you shouldn't be using __file__" doesn't hold up when sometimes you do need to use __file__, and there's no universal replacement for it.
>
> (Also, every Python programmer I've met who's faced with the decision would choose "well, we need to use __file__, so don't zip things" over "well, we need to zip things, so don't use __file__". Yes, it's bad that Python programmers even have to make this choice, and then on top of that they make the un-recommended choice, but that's how things are.)

We certainly see a ton of __file__ usage, but I’m not sure whether that’s because most developers aren’t aware of the implications, don’t know of the alternatives, or just use the simplest thing possible.

Using __file__ in your application, personal web service, or private library is fine.  The problem is exacerbated when you use __file__ in your publicly released libraries, because not only can’t *you* use them in zip files, but nothing that depends on your library can use zip files.  Given how popular pex is (and hopefully shiv will be), that will cause pain up the Python food chain, and it may mean that other people won’t be able to use your library.
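
To illustrate the zip case, here is a small sketch under stated assumptions: it writes a throwaway package named `zipped_pkg` (a made-up name) into a zip archive, imports it from there, and compares what `__file__` gives you with what `importlib.resources.read_text` gives you.

```python
import importlib
import importlib.resources
import os
import shutil
import sys
import tempfile
import zipfile


def demo_zipped_package():
    """Import a package from a zip; return (does __file__ exist?, resource text)."""
    tmp = tempfile.mkdtemp()
    archive = os.path.join(tmp, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_pkg/__init__.py", "")
        zf.writestr("zipped_pkg/data.txt", "hello from inside the zip")
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        pkg = importlib.import_module("zipped_pkg")
        # __file__ points *inside* the archive, so it is not a real file
        # system path and cannot be opened with ordinary file APIs.
        file_exists = os.path.exists(pkg.__file__)
        # The resource API goes through the loader, so it still works.
        text = importlib.resources.read_text(pkg, "data.txt")
        return file_exists, text
    finally:
        sys.path.remove(archive)
        sys.modules.pop("zipped_pkg", None)
        shutil.rmtree(tmp, ignore_errors=True)


if __name__ == "__main__":
    print(demo_zipped_package())
```

Code that builds paths from `__file__` fails in this situation, and so does anything that depends on it.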

It’s certainly a trade-off, but it’s important to keep this in mind.

If hierarchical resource paths are important to you, I invite you to submit an issue to our GitLab project:

https://gitlab.com/python-devs/importlib_resources/issues

Cheers,
-Barry

_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/