From raulcumplido at gmail.com Tue Nov 17 05:18:19 2015
From: raulcumplido at gmail.com (Raúl Cumplido)
Date: Tue, 17 Nov 2015 10:18:19 +0000
Subject: [Import-SIG] DeprecationWarning for Python3.6 - Loaders create_module()
Message-ID:

Hi all,

I am preparing a talk for PyCon ES (Spanish PyCon) about the import machinery and I was taking a look on the importlib/_bootstrap.py#module_from_spec

I see there is a Deprecation warning as on Python3.6 loaders defining exec module must also define create module, but there is also a comment that says: "Typically loaders will not implement create_module()"

I see that a little bit contradictory, what are the expected changes for the new Python3.6 version? Are the existing Loaders going to be changed to implement create_module?

Sorry if this has been explained before but I cannot find in the archives the rationale for that.

Kind Regards,
Raul
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brett at python.org Tue Nov 17 12:41:02 2015
From: brett at python.org (Brett Cannon)
Date: Tue, 17 Nov 2015 17:41:02 +0000
Subject: [Import-SIG] DeprecationWarning for Python3.6 - Loaders create_module()
In-Reply-To:
References:
Message-ID:

On Tue, 17 Nov 2015 at 02:30 Raúl Cumplido wrote:

> Hi all,
>
> I am preparing a talk for PyCon ES (Spanish PyCon) about the import
> machinery and I was taking a look on the
> importlib/_bootstrap.py#module_from_spec
>
> I see there is a Deprecation warning as on Python3.6 loaders defining exec
> module must also define create module, but there is also a comment that
> says: "Typically loaders will not implement create_module()"
>
> I see that a little bit contradictory,
>

So it's not contradictory if you happen to realize the implicit ending to that comment is "... because importlib.abc.Loader defines the method."

> what are the expected changes for the new Python3.6 version?
>

Loaders must implement create_module() if they define exec_module().
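As an illustration of the rule just stated, here is a minimal sketch of a loader that defines both methods. The class name ConstantLoader and the module name constant_demo are made up for this example. The create_module(spec) signature, and the convention that returning None asks the import machinery to create the module in the default way, are the ones documented for importlib.abc.Loader.

```python
import importlib.abc
import importlib.util


class ConstantLoader(importlib.abc.Loader):
    """Hypothetical loader that populates a module with a single constant."""

    def create_module(self, spec):
        # Returning None requests the default module creation semantics.
        return None

    def exec_module(self, module):
        # Executing the module here just means setting one attribute on it.
        module.ANSWER = 42


# Build a spec for an imaginary module and run it through the loader by hand.
spec = importlib.util.spec_from_loader("constant_demo", ConstantLoader())
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
```

Subclassing importlib.abc.Loader would already supply a create_module() that returns None, so the explicit override is only there to make the contract visible.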
> Are the existing Loaders going to be changed to implement create_module?
>

They all have since Python 3.5; remember that create_module() can return None and that will follow default semantics, so defining the method can be as short as `def create_module(self, spec): pass`.

>
> Sorry if this has been explained before but I cannot find in the archives
> the rationale for that.
>

https://bugs.python.org/issue23014

-Brett

From brett at python.org Fri Nov 20 16:23:56 2015
From: brett at python.org (Brett Cannon)
Date: Fri, 20 Nov 2015 21:23:56 +0000
Subject: [Import-SIG] Proposed design for importlib.resources()
Message-ID:

I have created a Jupyter Notebook to explain my thinking on what importlib.resources() should be (at least initially). You can view the notebook at http://nbviewer.jupyter.org/gist/brettcannon/9c4681a77a7fa09c5347 or download it and play with the code live in your own copy (you can download Anaconda 2.4 if you don't have Jupyter already set up under Python 3.5: https://www.continuum.io/downloads; I have filed https://github.com/binder-project/binder/issues/38 to try and get mybinder.org updated to Python 3.5 so that can be used instead).

The notebook is a bit long and is much better formatted elsewhere, so I'm not going to inline it here. If you want to comment on the notebook just copy and paste the relevant part into your reply.

From ncoghlan at gmail.com Fri Nov 20 23:01:17 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 21 Nov 2015 14:01:17 +1000
Subject: [Import-SIG] Proposed design for importlib.resources()
In-Reply-To:
References:
Message-ID:

On 21 November 2015 at 07:23, Brett Cannon wrote:

> I have created a Jupyter Notebook to explain my thinking on what
> importlib.resources() should be (at least initially).
You can view the > notebook at > http://nbviewer.jupyter.org/gist/brettcannon/9c4681a77a7fa09c5347 or > download it and play with the code live in your own copy (you can download > Anaconda 2.4 if you don't have Jupyter already set up under Python 3.5: > https://www.continuum.io/downloads; I have filed > https://github.com/binder-project/binder/issues/38 to try and get > mybinder.org updated to Python 3.5 so that can be used instead). > > The notebook is a bit long and is much better formatted elsewhere, so I'm > not going to inline it here. If you want to comment on the notebook just > copy and paste the relevant part into your reply. > The general usage API design looks good to me, but the current proposal for retrieving the resource reader uses loader_state incorrectly - that's defined in PEP 451 as an opaque object from the import machinery's point of view, so there's no requirement for it to be a mapping. Instead, "resource_reader" either needs to be a new optional attribute on the module spec, or else a new optional method on the Loader API. My preference is for the latter, as that way we'll never create resource reader instances for the vast majority of modules, while with the current proposal we'd create a reader instance for every module *spec* constructed, even if nothing in the application uses the new resource access API. Some other smaller notes: * The notebook reports the result of your straw poll incorrectly - you say "approach 2" won out, but "approach 1" (the object oriented one) did (by a 4:1 margin). 
* In relation to sharing files, there are actually options we can pass to CreateFile to make it possible to open a temporary file by name even while we keep the original handle open: https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858%28v=vs.85%29.aspx * There's also an open issue discussing the significant limitations of tempfile.NamedTemporaryFile on Windows: http://bugs.python.org/issue14243 I don't think either of those notes about shared file access on Windows affect your proposed solution, I just think they're worth referencing. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sat Nov 21 11:19:56 2015 From: brett at python.org (Brett Cannon) Date: Sat, 21 Nov 2015 16:19:56 +0000 Subject: [Import-SIG] Proposed design for importlib.resources() In-Reply-To: References: Message-ID: On Fri, 20 Nov 2015 at 20:01 Nick Coghlan wrote: > On 21 November 2015 at 07:23, Brett Cannon wrote: > >> I have created a Jupyter Notebook to explain my thinking on what >> importlib.resources() should be (at least initially). You can view the >> notebook at >> http://nbviewer.jupyter.org/gist/brettcannon/9c4681a77a7fa09c5347 or >> download it and play with the code live in your own copy (you can download >> Anaconda 2.4 if you don't have Jupyter already set up under Python 3.5: >> https://www.continuum.io/downloads; I have filed >> https://github.com/binder-project/binder/issues/38 to try and get >> mybinder.org updated to Python 3.5 so that can be used instead). >> >> The notebook is a bit long and is much better formatted elsewhere, so I'm >> not going to inline it here. If you want to comment on the notebook just >> copy and paste the relevant part into your reply. 
>> > > The general usage API design looks good to me, but the current proposal > for retrieving the resource reader uses loader_state incorrectly - that's > defined in PEP 451 as an opaque object from the import machinery's point of > view, so there's no requirement for it to be a mapping. Instead, > Dammit, it was so convenient! > "resource_reader" either needs to be a new optional attribute on the > module spec, or else a new optional method on the Loader API. > Yep. > > My preference is for the latter, as that way we'll never create resource > reader instances for the vast majority of modules, while with the current > proposal we'd create a reader instance for every module *spec* constructed, > even if nothing in the application uses the new resource access API. > I'll have to think about that. I'm not that worried about memory pressure from every module having a resource reader object (it's not like people import literally a million modules; I have never heard more than in the thousands). We could introduce Loader.resources(name) or __spec__.resources depending on how this plays out (people are welcome to provide feedback on which way they prefer). > > Some other smaller notes: > > * The notebook reports the result of your straw poll incorrectly - you say > "approach 2" won out, but "approach 1" (the object oriented one) did (by a > 4:1 margin). > Typo fixed in my copy. > * In relation to sharing files, there are actually options we can pass to > CreateFile to make it possible to open a temporary file by name even while > we keep the original handle open: > https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858%28v=vs.85%29.aspx > * There's also an open issue discussing the significant limitations of > tempfile.NamedTemporaryFile on Windows: http://bugs.python.org/issue14243 > > I don't think either of those notes about shared file access on Windows > affect your proposed solution, I just think they're worth referencing. 
> Yes, I'm ignoring everything you just said. :) If we were supporting the return of objects then I might care, but since I'm trying to keep the API surface small to start and thus not doing file objects this doesn't really play into this. If we add an open() method then these issues will be something we need to potentially care about. -Brett -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Nov 22 21:27:29 2015 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 23 Nov 2015 12:27:29 +1000 Subject: [Import-SIG] Proposed design for importlib.resources() In-Reply-To: References: Message-ID: On 22 November 2015 at 02:19, Brett Cannon wrote: > On Fri, 20 Nov 2015 at 20:01 Nick Coghlan wrote: >> The general usage API design looks good to me, but the current proposal >> for retrieving the resource reader uses loader_state incorrectly - that's >> defined in PEP 451 as an opaque object from the import machinery's point of >> view, so there's no requirement for it to be a mapping. Instead, > > Dammit, it was so convenient! We underspecified it precisely to avoid the temptation to give it semantic significance instead of treating it as an opaque reference allowing custom Importers to pass data to custom Loaders :) >> "resource_reader" either needs to be a new optional attribute on the >> module spec, or else a new optional method on the Loader API. > > Yep. > >> My preference is for the latter, as that way we'll never create resource >> reader instances for the vast majority of modules, while with the current >> proposal we'd create a reader instance for every module *spec* constructed, >> even if nothing in the application uses the new resource access API. > > I'll have to think about that. I'm not that worried about memory pressure > from every module having a resource reader object (it's not like people > import literally a million modules; I have never heard more than in the > thousands). 
We could introduce Loader.resources(name) or __spec__.resources > depending on how this plays out (people are welcome to provide feedback on > which way they prefer). My preference is to have it as a Loader method API. I actually have a concrete technical rationale for that, too: the loader concept predates the module spec concept by more than a decade, so folks customising the import system will necessarily have the ability to cope with module loaders. By contrast, module specs can only be relied on in Python 3 code, or in Python 2 code that's willing to introduce a hard dependency on the use of importlib2 as the import system. By instead having the resource reader as a (dynamically created?) subcomponent of the loader, we get to keep the overall structure of import system manipulation code the same - resource readers wouldn't be a new concept existing at the same level as module loaders, they'd be a new feature *of* module loaders. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Mon Nov 23 15:45:55 2015 From: barry at python.org (Barry Warsaw) Date: Mon, 23 Nov 2015 15:45:55 -0500 Subject: [Import-SIG] Proposed design for importlib.resources() References: Message-ID: <20151123154555.317b216e@anarchist.wooz.org> On Nov 20, 2015, at 09:23 PM, Brett Cannon wrote: >I have created a Jupyter Notebook to explain my thinking on what >importlib.resources() should be (at least initially). Just a few thoughts based on a review of two projects' use of pkg_resources. +1 on getting *something* into Python 3.6. Module API vs package API. Doesn't pkg_resources actually support something similar to both, with the module function providing a convenience API? I like that a lot because the convenience API is so darn... convenient! You just give it the Python dotted-path and the resource and it does the rest. Generally I don't care about caching the results of the search; these calls are almost never in performance critical code. 
resource_filename(). Doesn't pkg_resources already have a strategy for the temporary file that sometimes has to be created? Maybe it doesn't work so well on some platforms (I've never noticed a problem on *nix). A context manager as proposed seems like the most reasonable approach. We definitely need this API though. I see plenty of examples where e.g. test data files have to be shutil.copy()'d, passed to subprocess command line arguments, etc. read_bytes(). Thank you for the truth in advertising! IIRC in Python 3, pkg_resource.resource_string() actually returns bytes. from-import-as to the rescue. An actual resource_string() would have to accept an encoding argument (as would any resource-based open() method). resource_stream(). IIRC, the pkg_resource's version is not a context manager so it has to be closed explicitly (or wrapped in contextlib.closing()). We can do better. I do have one use of resource_listdir() which is used to find importable plugin modules at runtime. It's handy. That's all for now. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From brett at python.org Tue Nov 24 15:21:40 2015 From: brett at python.org (Brett Cannon) Date: Tue, 24 Nov 2015 20:21:40 +0000 Subject: [Import-SIG] Proposed design for importlib.resources() In-Reply-To: <20151123154555.317b216e@anarchist.wooz.org> References: <20151123154555.317b216e@anarchist.wooz.org> Message-ID: On Mon, 23 Nov 2015 at 13:46 Barry Warsaw wrote: > On Nov 20, 2015, at 09:23 PM, Brett Cannon wrote: > > >I have created a Jupyter Notebook to explain my thinking on what > >importlib.resources() should be (at least initially). > > Just a few thoughts based on a review of two projects' use of > pkg_resources. > +1 on getting *something* into Python 3.6. > > Module API vs package API. 
Doesn't pkg_resources actually support > something > similar to both, with the module function providing a convenience API? Yes, but that doesn't sway me. This isn't a "pkg_resources++" but a "make reading data from a package make sense in a modern import world". IOW I'm purposefully not using pkg_resources as a template but simply as a motivating factor. > I like > that a lot because the convenience API is so darn... convenient! You just > give it the Python dotted-path and the resource and it does the rest. > I don't see how that's any different than the other approach since you're still providing the exact same data; no more, no less. > Generally I don't care about caching the results of the search; these calls > are almost never in performance critical code. > Unfortunately for you the poll liked the other approach and TOOWTDI. So either convince me that resources.read_bytes(pkg, path) is better than resources(pkg).read_bytes(path) or consider the bike shed painted. :) > > resource_filename(). Doesn't pkg_resources already have a strategy for the > temporary file that sometimes has to be created? Yes and I don't like it. :) Basically you either create an instance or implicitly use a global instance of a class that stores the references and registers with atexit a cleanup function to be executed. > Maybe it doesn't work so > well on some platforms (I've never noticed a problem on *nix). A context > manager as proposed seems like the most reasonable approach. We definitely > need this API though. I see plenty of examples where e.g. test data files > have to be shutil.copy()'d, passed to subprocess command line arguments, > etc. > OK, between you and Donald saying you have real needs for the API you can rest assured that it will be in the initial version, especially since I already coded up the tempfile implementation. > > read_bytes(). Thank you for the truth in advertising! IIRC in Python 3, > pkg_resource.resource_string() actually returns bytes. 
from-import-as to the
> rescue. An actual resource_string() would have to accept an encoding argument
> (as would any resource-based open() method).
>

Yes, which is why I don't think it's worth it to provide a resource_string() since calling decode isn't difficult (and is something you must know in Python 3).

>
> resource_stream(). IIRC, the pkg_resource's version is not a context manager
> so it has to be closed explicitly (or wrapped in contextlib.closing()). We
> can do better.
>

I'm not convinced it's necessary to provide an equivalent open() yet; if you have an API that requires a file-like object then io.BytesIO to the rescue for read_bytes(). There is nothing tricky to get right like with a file path that may or may not be backed by a temporary file. This is a somewhat low-level API and if people want to provide convenience wrappers that's fine but I don't want to start guessing at needs beyond core APIs or ones that are hard to get right and allow for composability to higher APIs like file-like objects which others can handle.

>
> I do have one use of resource_listdir() which is used to find importable
> plugin modules at runtime. It's handy.
>

I'm going to punt on this for as long as possible because it's asking for trouble to get right. For example, if I do resources(pkg).listdir(), then I will end up returning relative paths, but if you disassociate those paths from pkg then you have lost proper context. You could return tuples of (pkg, relative_path), but that just doesn't seem satisfactory either. I'm just not convinced yet it is needed enough to support (at least initially).

-Brett

>
> That's all for now.
> -Barry
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> https://mail.python.org/mailman/listinfo/import-sig
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From donald at stufft.io Tue Nov 24 15:47:10 2015
From: donald at stufft.io (Donald Stufft)
Date: Tue, 24 Nov 2015 15:47:10 -0500
Subject: [Import-SIG] Proposed design for importlib.resources()
In-Reply-To:
References: <20151123154555.317b216e@anarchist.wooz.org>
Message-ID: <80216760-9F2A-430C-8E86-02DB4F98D622@stufft.io>

> On Nov 24, 2015, at 3:21 PM, Brett Cannon wrote:
>
> > resource_filename(). Doesn't pkg_resources already have a strategy for the
> > temporary file that sometimes has to be created?
>
> Yes and I don't like it. :) Basically you either create an instance or implicitly use a global instance of a class that stores the references and registers with atexit a cleanup function to be executed.

In pkg_resources they aren't actually temporary files but are instead cached files which are stored in a non temporary location per user and will be recreated if need be. It would probably be sane to make this context manager like the ones in tempfile where the creation happens in __init__ and the deletion happens in __exit__ (with also a .delete() function) so people can choose how to use it. If you wanted something more like what pkg_resources does now, you could do:

    filename = importlib.resources.get_filename('mypackage', 'some/path.txt', dir='~/.cache/mypackage')

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From barry at python.org Tue Nov 24 16:05:02 2015 From: barry at python.org (Barry Warsaw) Date: Tue, 24 Nov 2015 16:05:02 -0500 Subject: [Import-SIG] Proposed design for importlib.resources() In-Reply-To: References: <20151123154555.317b216e@anarchist.wooz.org> Message-ID: <20151124160502.2cf07d4d@anarchist.wooz.org> On Nov 24, 2015, at 08:21 PM, Brett Cannon wrote: >> Module API vs package API. Doesn't pkg_resources actually support >> something similar to both, with the module function providing a convenience >> API? > >Yes, but that doesn't sway me. This isn't a "pkg_resources++" but a "make >reading data from a package make sense in a modern import world". IOW I'm >purposefully not using pkg_resources as a template but simply as a >motivating factor. You're not taking into account a migration path for existing users of pkg_resources. If you make it difficult to convert, then people are much less likely to do it for existing code, despite the ability to remove a dependency. If my existing code already has from pkg_resources import resource_string as resource_bytes all I'd need to do is change this one line to from importlib.resource import read_bytes as resource_bytes and I'm done. If I need to support multiple versions of Python, I can even do: try: from importlib.resource import read_bytes as resource_bytes except ImportError: from pkg_resources import resource_string as resource_bytes Without this API, it's much more difficult for me to convert my existing code, either incrementally or whole-hog, because now I have to either add that convenience function myself (and import it everywhere) or rewrite all my call sites. Why bother? >Unfortunately for you the poll liked the other approach and TOOWTDI. So >either convince me that resources.read_bytes(pkg, path) is better than >resources(pkg).read_bytes(path) or consider the bike shed painted. 
:) It's not better or worse, it's just different. As pkg_resources has shown, it doesn't have to be either-or. I never saw the poll since I don't pay attention to Google+. How representative are those 59 votes of the current pkg_resource users and potential future users of this API? If I had seen the poll I would have complained that it didn't give me a chance to choose both APIs . >I'm not convinced it's necessary to provide an equivalent open() yet; Right, I'm not necessarily advocating for it, just describing what it would have to do if it were there. It's something I occasionally wish I had, but all the building blocks are there to invent it when needed. >> I do have one use of resource_listdir() which is used to find importable >> plugin modules at runtime. It's handy. > >I'm going to punt on this for as long as possible because it's asking for >trouble to get right. For example, if I do resources(pkg).listdir(), then I >will end up returning relative paths, but if you disassociate those paths >from pkg then you have lost proper context. You could return tuples of >(pkg, relative_path), but that just doesn't seem satisfactory either. I'm >just not convinced yet it is needed enough to support (at least initially). It's a tougher API to recreate from the building blocks, so it would be nice not to have to reinvent the wheel everywhere, but it's also a much less common API. I'm not at all worried about the disassociation problem, since os.listdir() gives you relative paths anyway so it's a familiar behavior. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From donald at stufft.io Tue Nov 24 16:37:14 2015 From: donald at stufft.io (Donald Stufft) Date: Tue, 24 Nov 2015 16:37:14 -0500 Subject: [Import-SIG] Proposed design for importlib.resources() In-Reply-To: <20151124160502.2cf07d4d@anarchist.wooz.org> References: <20151123154555.317b216e@anarchist.wooz.org> <20151124160502.2cf07d4d@anarchist.wooz.org> Message-ID: > On Nov 24, 2015, at 4:05 PM, Barry Warsaw wrote: > > On Nov 24, 2015, at 08:21 PM, Brett Cannon wrote: > >>> Module API vs package API. Doesn't pkg_resources actually support >>> something similar to both, with the module function providing a convenience >>> API? >> >> Yes, but that doesn't sway me. This isn't a "pkg_resources++" but a "make >> reading data from a package make sense in a modern import world". IOW I'm >> purposefully not using pkg_resources as a template but simply as a >> motivating factor. > > You're not taking into account a migration path for existing users of > pkg_resources. If you make it difficult to convert, then people are much less > likely to do it for existing code, despite the ability to remove a dependency. > > If my existing code already has > > from pkg_resources import resource_string as resource_bytes > > all I'd need to do is change this one line to > > from importlib.resource import read_bytes as resource_bytes > > and I'm done. If I need to support multiple versions of Python, I can even > do: > > try: > from importlib.resource import read_bytes as resource_bytes > except ImportError: > from pkg_resources import resource_string as resource_bytes > > Without this API, it's much more difficult for me to convert my existing code, > either incrementally or whole-hog, because now I have to either add that > convenience function myself (and import it everywhere) or rewrite all my call > sites. Why bother? 
    try:
        from importlib import resources
        resource_bytes = lambda m, r: resources(m).read_bytes(r)
    except ImportError:
        from pkg_resources import resource_string as resource_bytes

I kind of agree though that given there isn't a major difference otherwise, that making it easier to port code to the new API is a useful thing.

>
>> Unfortunately for you the poll liked the other approach and TOOWTDI. So
>> either convince me that resources.read_bytes(pkg, path) is better than
>> resources(pkg).read_bytes(path) or consider the bike shed painted. :)
>
> It's not better or worse, it's just different. As pkg_resources has shown, it
> doesn't have to be either-or.
>
> I never saw the poll since I don't pay attention to Google+. How
> representative are those 59 votes of the current pkg_resource users and
> potential future users of this API? If I had seen the poll I would have
> complained that it didn't give me a chance to choose both APIs .
>
>> I'm not convinced it's necessary to provide an equivalent open() yet;
>
> Right, I'm not necessarily advocating for it, just describing what it would
> have to do if it were there. It's something I occasionally wish I had, but
> all the building blocks are there to invent it when needed.
>
>>> I do have one use of resource_listdir() which is used to find importable
>>> plugin modules at runtime. It's handy.
>>
>> I'm going to punt on this for as long as possible because it's asking for
>> trouble to get right. For example, if I do resources(pkg).listdir(), then I
>> will end up returning relative paths, but if you disassociate those paths
>> from pkg then you have lost proper context. You could return tuples of
>> (pkg, relative_path), but that just doesn't seem satisfactory either. I'm
>> just not convinced yet it is needed enough to support (at least initially).
>
> It's a tougher API to recreate from the building blocks, so it would be nice
> not to have to reinvent the wheel everywhere, but it's also a much less common
> API.
I'm not at all worried about the disassociation problem, since > os.listdir() gives you relative paths anyway so it's a familiar behavior. > > Cheers, > -Barry > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > https://mail.python.org/mailman/listinfo/import-sig ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From brett at python.org Tue Nov 24 18:25:50 2015 From: brett at python.org (Brett Cannon) Date: Tue, 24 Nov 2015 23:25:50 +0000 Subject: [Import-SIG] Proposed design for importlib.resources() In-Reply-To: <20151124160502.2cf07d4d@anarchist.wooz.org> References: <20151123154555.317b216e@anarchist.wooz.org> <20151124160502.2cf07d4d@anarchist.wooz.org> Message-ID: On Tue, 24 Nov 2015 at 14:13 Barry Warsaw wrote: > On Nov 24, 2015, at 08:21 PM, Brett Cannon wrote: > > >> Module API vs package API. Doesn't pkg_resources actually support > >> something similar to both, with the module function providing a > convenience > >> API? > > > >Yes, but that doesn't sway me. This isn't a "pkg_resources++" but a "make > >reading data from a package make sense in a modern import world". IOW I'm > >purposefully not using pkg_resources as a template but simply as a > >motivating factor. > > You're not taking into account a migration path for existing users of > pkg_resources. If you make it difficult to convert, then people are much > less > likely to do it for existing code, despite the ability to remove a > dependency. > It's one of those situations where it's balancing future code with migrating old code. 
I'm doing this to solve the problem of importlib lacking any standardized way to get at resources in a package, not to migrate pkg_resources users who want to eliminate that dependency (although that would be a perk). > > If my existing code already has > > from pkg_resources import resource_string as resource_bytes > > all I'd need to do is change this one line to > > from importlib.resource import read_bytes as resource_bytes > > and I'm done. If I need to support multiple versions of Python, I can even > do: > > try: > from importlib.resource import read_bytes as resource_bytes > except ImportError: > from pkg_resources import resource_string as resource_bytes > > Without this API, it's much more difficult for me to convert my existing > code, > either incrementally or whole-hog, because now I have to either add that > convenience function myself (and import it everywhere) or rewrite all my > call > sites. Why bother? > Nothing is stopping people from writing their own pkg_resources compatibility layer. Hell, I will promise to create shim_resources or something and put it on PyPI for those that want a really simple migration path. But the base API that goes into the stdlib and will need to be supported forever doesn't need to go the way of compatibility if its going to feel out of place in importlib (which the pkg_resources API will, especially if we put a method on loaders to get a resource loader). > > >Unfortunately for you the poll liked the other approach and TOOWTDI. So > >either convince me that resources.read_bytes(pkg, path) is better than > >resources(pkg).read_bytes(path) or consider the bike shed painted. :) > > It's not better or worse, it's just different. As pkg_resources has > shown, it > doesn't have to be either-or. > > I never saw the poll since I don't pay attention to Google+. Which is why I also linked to it on Twitter. 
:) I actually tried to do it on twitter initially but guess whose poll support restricts option lengths so much you can't type a method call out. :p How representative are those 59 votes of the current pkg_resource users and potential future users of this API? If I had seen the poll I would have complained that it didn't give me a chance to choose both APIs . >I'm not convinced it's necessary to provide an equivalent open() yet; Right, I'm not necessarily advocating for it, just describing what it would have to do if it were there. It's something I occasionally wish I had, but all the building blocks are there to invent it when needed. >> I do have one use of resource_listdir() which is used to find importable >> plugin modules at runtime. It's handy. > >I'm going to punt on this for as long as possible because it's asking for >trouble to get right. For example, if I do resources(pkg).listdir(), then I >will end up returning relative paths, but if you disassociate those paths >from pkg then you have lost proper context. You could return tuples of >(pkg, relative_path), but that just doesn't seem satisfactory either. I'm >just not convinced yet it is needed enough to support (at least initially). It's a tougher API to recreate from the building blocks, so it would be nice not to have to reinvent the wheel everywhere, but it's also a much less common API. I'm not at all worried about the disassociation problem, since os.listdir() gives you relative paths anyway so it's a familiar behavior. Yeah, I realize it's something you can't make from scratch, but I'm still going to avoid it while I can because as soon as this goes in then people are going to want a similar API for discovering modules in a package and would abuse this API if they don't get the other API. 
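Two of the building blocks discussed in this thread can be sketched with the standard library alone: wrapping resource bytes in io.BytesIO when a file-like object is wanted, and a context manager that exposes the bytes as a real file path only for the duration of a with block. This is a rough sketch under stated assumptions, not the proposed implementation: the helper name bytes_as_file_path and the sample data are invented for illustration, and the bytes stand in for whatever a read_bytes()-style call would return.

```python
import contextlib
import io
import os
import tempfile


@contextlib.contextmanager
def bytes_as_file_path(data, suffix=""):
    """Hypothetical helper: expose in-memory resource bytes as a file path.

    The file exists only inside the ``with`` block and is removed on exit.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        yield path
    finally:
        os.remove(path)


# Stand-in for the bytes a read_bytes()-style call would return.
resource_data = b"hello from a packaged resource\n"

# A file-like object needs no temporary file at all.
stream = io.BytesIO(resource_data)
first_line = stream.readline()

# A path on disk is needed when handing the resource to another program.
with bytes_as_file_path(resource_data, suffix=".txt") as path:
    with open(path, "rb") as handle:
        round_tripped = handle.read()
    existed_inside = os.path.exists(path)
existed_after = os.path.exists(path)
```

Tying the file's lifetime to the with block avoids the global registry plus atexit cleanup described earlier for pkg_resources, at the cost of rewriting the file on every use.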
-brett

> Cheers,
> -Barry
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> https://mail.python.org/mailman/listinfo/import-sig

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brett at python.org  Tue Nov 24 19:28:18 2015
From: brett at python.org (Brett Cannon)
Date: Wed, 25 Nov 2015 00:28:18 +0000
Subject: [Import-SIG] Proposed design for importlib.resources()
In-Reply-To:
References: <20151123154555.317b216e@anarchist.wooz.org>
 <20151124160502.2cf07d4d@anarchist.wooz.org>
Message-ID:

If we make it e.g., __loader__.resources().read_bytes(path) then I may be more
amenable to creating an importlib.resources module with the bastardized
pkg_resources API. Going to have to think about it, though.

On Tue, 24 Nov 2015, 16:25 Brett Cannon wrote:

> On Tue, 24 Nov 2015 at 14:13 Barry Warsaw wrote:
>
>> On Nov 24, 2015, at 08:21 PM, Brett Cannon wrote:
>>
>> >> Module API vs package API.  Doesn't pkg_resources actually support
>> >> something similar to both, with the module function providing a
>> >> convenience API?
>> >
>> >Yes, but that doesn't sway me. This isn't a "pkg_resources++" but a "make
>> >reading data from a package make sense in a modern import world". IOW I'm
>> >purposefully not using pkg_resources as a template but simply as a
>> >motivating factor.
>>
>> You're not taking into account a migration path for existing users of
>> pkg_resources.  If you make it difficult to convert, then people are much
>> less likely to do it for existing code, despite the ability to remove a
>> dependency.
>
> It's one of those situations where it's balancing future code with
> migrating old code. I'm doing this to solve the problem of importlib
> lacking any standardized way to get at resources in a package, not to
> migrate pkg_resources users who want to eliminate that dependency (although
> that would be a perk).
>
>> If my existing code already has
>>
>>     from pkg_resources import resource_string as resource_bytes
>>
>> all I'd need to do is change this one line to
>>
>>     from importlib.resource import read_bytes as resource_bytes
>>
>> and I'm done.  If I need to support multiple versions of Python, I can
>> even do:
>>
>>     try:
>>         from importlib.resource import read_bytes as resource_bytes
>>     except ImportError:
>>         from pkg_resources import resource_string as resource_bytes
>>
>> Without this API, it's much more difficult for me to convert my existing
>> code, either incrementally or whole-hog, because now I have to either add
>> that convenience function myself (and import it everywhere) or rewrite all
>> my call sites.  Why bother?
>
> Nothing is stopping people from writing their own pkg_resources
> compatibility layer. Hell, I will promise to create shim_resources or
> something and put it on PyPI for those that want a really simple migration
> path. But the base API that goes into the stdlib and will need to be
> supported forever doesn't need to go the way of compatibility if it's going
> to feel out of place in importlib (which the pkg_resources API will,
> especially if we put a method on loaders to get a resource loader).
>
>> >Unfortunately for you the poll liked the other approach and TOOWTDI. So
>> >either convince me that resources.read_bytes(pkg, path) is better than
>> >resources(pkg).read_bytes(path) or consider the bike shed painted. :)
>>
>> It's not better or worse, it's just different.  As pkg_resources has
>> shown, it doesn't have to be either-or.
>>
>> I never saw the poll since I don't pay attention to Google+.
>
> Which is why I also linked to it on Twitter. :) I actually tried to do it
> on Twitter initially but guess whose poll support restricts option lengths
> so much you can't type a method call out. :p
>
>> How representative are those 59 votes of the current pkg_resources users
>> and potential future users of this API?  If I had seen the poll I would
>> have complained that it didn't give me a chance to choose both APIs.
>>
>> >I'm not convinced it's necessary to provide an equivalent open() yet;
>>
>> Right, I'm not necessarily advocating for it, just describing what it
>> would have to do if it were there.  It's something I occasionally wish I
>> had, but all the building blocks are there to invent it when needed.
>>
>> >> I do have one use of resource_listdir() which is used to find importable
>> >> plugin modules at runtime.  It's handy.
>> >
>> >I'm going to punt on this for as long as possible because it's asking for
>> >trouble to get right. For example, if I do resources(pkg).listdir(), then
>> >I will end up returning relative paths, but if you disassociate those
>> >paths from pkg then you have lost proper context. You could return tuples
>> >of (pkg, relative_path), but that just doesn't seem satisfactory either.
>> >I'm just not convinced yet it is needed enough to support (at least
>> >initially).
>>
>> It's a tougher API to recreate from the building blocks, so it would be
>> nice not to have to reinvent the wheel everywhere, but it's also a much
>> less common API.  I'm not at all worried about the disassociation problem,
>> since os.listdir() gives you relative paths anyway so it's a familiar
>> behavior.
>
> Yeah, I realize it's something you can't make from scratch, but I'm still
> going to avoid it while I can because as soon as this goes in then people
> are going to want a similar API for discovering modules in a package and
> would abuse this API if they don't get the other API.
>
> -brett
>
>> Cheers,
>> -Barry
>> _______________________________________________
>> Import-SIG mailing list
>> Import-SIG at python.org
>> https://mail.python.org/mailman/listinfo/import-sig

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Wed Nov 25 04:36:15 2015
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 Nov 2015 19:36:15 +1000
Subject: [Import-SIG] Proposed design for importlib.resources()
In-Reply-To:
References: <20151123154555.317b216e@anarchist.wooz.org>
 <20151124160502.2cf07d4d@anarchist.wooz.org>
Message-ID:

On 25 November 2015 at 10:28, Brett Cannon wrote:
> If we make it e.g., __loader__.resources().read_bytes(path) then I may be
> more amenable to creating an importlib.resources module with the bastardized
> pkg_resources API. Going to have to think about it, though.

I think a compatibility shim on PyPI actually makes more sense, as then you
can put the conditional pkg_resources dependency *in the shim*.

That is, projects that switch would add a runtime dependency on
"pkg_resources_compat" (or whatever name you choose) and then do
"import pkg_resources_compat as pkg_resources".

On Python 3.5 and earlier versions, pkg_resources_compat would depend on
setuptools, and just re-export the pkg_resources APIs.

On Python 3.6 and later, it would instead be a compatibility wrapper around
the updated import machinery.

Cheers,
Nick.
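
[Editorial sketch, not part of Nick's message.] The shim layout described
above might look roughly like the following. This is a minimal sketch under
stated assumptions: the module name pkg_resources_compat comes from the
message, the (3, 6) cutoff is only the version floated in this thread, and
pkgutil.get_data() is used as a stand-in for the "updated import machinery",
which did not exist when this was written. Only resource_string() is covered.

```python
"""Hypothetical pkg_resources_compat module (illustrative sketch only)."""
import pkgutil
import sys

# Assumption: the version at which the stdlib takes over. The thread mentions
# "Python 3.6 and later"; treat this as a placeholder, not a decided value.
_STDLIB_CUTOFF = (3, 6)


def _stdlib_resource_string(package, resource_name):
    """Return a package resource as bytes, like pkg_resources.resource_string().

    pkgutil.get_data() is a stand-in here; it returns None when the package
    cannot be located or its loader does not support get_data().
    """
    data = pkgutil.get_data(package, resource_name)
    if data is None:
        raise OSError(
            "cannot read %r from package %r" % (resource_name, package)
        )
    return data


if sys.version_info < _STDLIB_CUTOFF:
    # Older interpreters: depend on setuptools and re-export its API.
    from pkg_resources import resource_string
else:
    # Newer interpreters: wrap what the standard library provides.
    resource_string = _stdlib_resource_string
```

A project using such a shim would write "import pkg_resources_compat as
pkg_resources" and keep its existing pkg_resources.resource_string(...) calls
unchanged.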
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From barry at python.org  Wed Nov 25 10:16:15 2015
From: barry at python.org (Barry Warsaw)
Date: Wed, 25 Nov 2015 10:16:15 -0500
Subject: [Import-SIG] Proposed design for importlib.resources()
In-Reply-To:
References: <20151123154555.317b216e@anarchist.wooz.org>
 <20151124160502.2cf07d4d@anarchist.wooz.org>
Message-ID: <20151125101615.60d8977e@anarchist.wooz.org>

On Nov 25, 2015, at 12:28 AM, Brett Cannon wrote:

>If we make it e.g., __loader__.resources().read_bytes(path) then I may be
>more amenable to creating an importlib.resources module with the bastardized
>pkg_resources API. Going to have to think about it, though.

I hope it works out.  I want a lot of people to use the stdlib API as soon as
possible.  That not only reduces an external dependency, but helps to exercise
the stdlib code much more quickly, exposing any bugs, corner cases, or missing
features early.  If there's an easy migration path, it's much more likely to
be adopted sooner.  Case in point: the enum module, which was compatible
enough with existing third party APIs that a simple conditional import
statement allowed for an early opt-in.

A third party shim module has several disadvantages.  You're trading one
external dependency for another, and on some platforms you might even be
trading a distro-packaged one for a non-distro-packaged one.  You're certainly
trading a tried-and-true dependency for a brand new one.  Plus, you have to
commit to supporting that new PyPI package for a long time, and make it
compatible across multiple versions of Python.  If you're going to go through
all that trouble, why not include the batteries in the first place?  It's not
like the simple pkg_resources API is a bad API.

Since I think you understand where I'm coming from by now, I'll stop
belaboring the point.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
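
[Editorial sketch, not part of the thread.] For readers comparing the two call
styles debated above, resources(pkg).read_bytes(path) and
resources.read_bytes(pkg, path), here is a rough sketch of how the function
form could be layered on the object form so that both coexist. The names
Resources and read_bytes are hypothetical, and pkgutil.get_data() is used
purely as a stand-in for whatever a loader would eventually provide; this is
not the API proposed in the notebook.

```python
import pkgutil


class Resources:
    """Object style: bind a package name once, then read paths relative to it.

    Hypothetical class for illustration; not an importlib API.
    """

    def __init__(self, package):
        self.package = package

    def read_bytes(self, path):
        # pkgutil.get_data() returns None when the package cannot be located
        # or its loader does not implement get_data().
        data = pkgutil.get_data(self.package, path)
        if data is None:
            raise LookupError(
                "cannot read %r from package %r" % (path, self.package)
            )
        return data


def read_bytes(package, path):
    """Function style: a thin convenience wrapper over the object style."""
    return Resources(package).read_bytes(path)
```

With this layout, Resources("mypkg").read_bytes("data.txt") and
read_bytes("mypkg", "data.txt") return the same bytes; "mypkg" and "data.txt"
are placeholder names. The function form costs one line on top of the object
form, which is the "not either-or" point raised in the thread.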