Fwd: [mpi4py] Fwd: [Numpy-discussion] Improving Python+MPI import performance
For a long time we have been running into this very problem. I think it would be appropriate to utilize this code on Kraken, Ranger, etc. My implementation suggestion would be to put this in the new startup_tasks, where we determine parallelism. As noted in the docstring, it will have to be modified to use mpi4py.

Britton or Stephen, this sounds like it's directly up your alley, as you run on Kraken the most often. Would one of you be willing to test it out? My feeling is that we could simply suggest that on these systems we use this idiom at the top of scripts (where we assume we distribute this script with yt):

    from yt.mpi_importer import mpi_import
    with mpi_import():
        from yt.mods import *

I think it should recursively watch all the imports. An alternate option would be to insert some of its logic into yt.mods, or even have a second mods file that handles it seamlessly, like:

    from yt.pmods import *

Ideas?

-Matt

---------- Forwarded message ----------
From: Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no>
Date: Fri, Jan 13, 2012 at 3:51 AM
Subject: [mpi4py] Fwd: [Numpy-discussion] Improving Python+MPI import performance
To: mpi4py@googlegroups.com
Cc: Chris Kees <cekees@gmail.com>

This looks very interesting,

Dag

-------- Original Message --------
Subject: [Numpy-discussion] Improving Python+MPI import performance
Date: Thu, 12 Jan 2012 17:13:41 -0800
From: Asher Langton <langton2@llnl.gov>
Reply-To: Discussion of Numerical Python <numpy-discussion@scipy.org>
To: numpy-discussion@scipy.org

Hi all,

(I originally posted this to the BayPIGgies list, where Fernando Perez suggested I send it to the NumPy list as well. My apologies if you're receiving this email twice.)

I work on a Python/C++ scientific code that runs as a number of independent Python processes communicating via MPI. Unfortunately, as some of you may have experienced, module importing does not scale well in Python/MPI applications. For 32k processes on BlueGene/P, importing 100 trivial C-extension modules takes 5.5 hours, compared to 35 minutes for all other interpreter loading and initialization. We developed a simple pure-Python module (based on knee.py, a hierarchical import example) that cuts the import time from 5.5 hours to 6 minutes. The code is available here:

https://github.com/langton/MPI_Import

Usage, implementation details, and limitations are described in a docstring at the beginning of the file (just after the mandatory legalese).

I've talked with a few people who've faced the same problem and heard about a variety of approaches, which range from putting all necessary files in one directory to hacking the interpreter itself so it distributes the module-loading over MPI. Last summer, I had a student intern try a few of these approaches. It turned out that the problem wasn't so much the simultaneous module loads, but rather the huge number of failed open() calls (ENOENT) as the interpreter tries to find the module files. In the MPI_Import module, we have rank 0 perform the module lookups and then broadcast the locations to the rest of the processes. For our real-world scientific applications written in Python and C++, this has meant that we can start a problem and actually make computational progress before the batch allocation ends.

If you try out the code, I'd appreciate any feedback you have: performance results, bugfixes/feature-additions, or alternate approaches to solving this problem. Thanks!
-Asher
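[Editor's note: to make the mechanism Asher describes concrete, here is a minimal sketch of the rank-0-lookup-and-broadcast idea using mpi4py and the modern importlib API. This is not the actual MPI_Import code, which hooks the import machinery itself; the helper name locate_module is purely illustrative.]

    # Rank 0 resolves where a module lives and broadcasts that location, so
    # the other ranks skip the flood of failed open()/stat() calls along
    # sys.path. Hypothetical helper; MPI_Import hooks __import__ instead.
    import importlib.util
    from mpi4py import MPI

    def locate_module(name, comm=MPI.COMM_WORLD):
        origin = None
        if comm.rank == 0:
            spec = importlib.util.find_spec(name)
            if spec is not None:
                origin = spec.origin
        # Every rank receives the path found by rank 0.
        return comm.bcast(origin, root=0)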
Hi Matt,
Britton or Stephen, this sounds like it's directly up your alley as you run on Kraken the most often.
I can give it a try, but wouldn't this help on all computers, or am I misunderstanding things?
from yt.pmods import *
Ideas?
What about analysis modules that import stuff, too? Like halo finding. I guess I have to read more about this, so that's more of a rhetorical question.

--
Stephen Skory
s@skory.us
http://stephenskory.com/
510.621.3687 (google voice)
Hi all,

I was able to test it, and I found a problem with mutually-recursive imports in unittest. I'll update if I can find a resolution. For analysis_modules it should be okay: if you access them via amods, it should already know the location from which to import them.

-Matt

On Fri, Jan 13, 2012 at 2:29 PM, Stephen Skory <s@skory.us> wrote:
Hi again,

I've written to the package maintainer, but if you insert this line:

    while fqname[-1] == ".": fqname = fqname[:-1]

at the top of the function definition __import_module__ inside MPI_Import.py, it should work for importing yt.mods. This script works great for me:

    from MPI_Import import mpi_import
    with mpi_import():
        from yt.mods import *
    mylog.info("Hello")

I think it's ready to test on Kraken/Janus/etc.

-Matt

On Fri, Jan 13, 2012 at 2:31 PM, Matthew Turk <matthewturk@gmail.com> wrote:
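[Editor's note: for illustration only, this is the effect of that one-line fix in isolation. normalize_fqname is a hypothetical name; in MPI_Import.py the loop is simply inserted at the top of the existing function.]

    # Strip trailing '.' characters from a fully-qualified module name, so
    # a name like "yt.mods." resolves the same as "yt.mods" during the
    # rank-0 lookup.
    def normalize_fqname(fqname):
        while fqname and fqname[-1] == ".":
            fqname = fqname[:-1]
        return fqname

    assert normalize_fqname("yt.mods.") == "yt.mods"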
Hi Matt,

That addition is working for me on Kraken. I'm running this test for a number of cores and will report soon.

Britton

On Sun, Jan 15, 2012 at 6:53 AM, Matthew Turk <matthewturk@gmail.com> wrote:
Hi Everyone,

I just tested this new import method with Matt's fix on Kraken against the standard import for a variety of core counts. In the attached figure, I plot the mean time to import, with error bars showing the minimum and maximum time over all cores. I think the plot speaks for itself.

Britton

On Sun, Jan 15, 2012 at 10:04 AM, Britton Smith <brittonsmith@gmail.com> wrote:
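[Editor's note: a per-rank import timing of this sort can be gathered with a short mpi4py script along these lines. This is a sketch of the kind of measurement described above, not Britton's actual test script.]

    # Time the yt import on every rank and report mean/min/max across ranks;
    # swap the import block for the MPI_Import/yt.pmods version to compare
    # against the standard import path.
    import time
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    start = time.time()
    import yt.mods  # or: from MPI_Import import mpi_import; with mpi_import(): ...
    elapsed = time.time() - start

    times = comm.gather(elapsed, root=0)
    if comm.rank == 0:
        print("mean %.2f s, min %.2f s, max %.2f s"
              % (sum(times) / len(times), min(times), max(times)))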
Hi Britton & Matt,
I just tested this new import method with Matt's fix on Kraken against the standard import for a variety of core counts. In the attached figure, I plot the mean time to import, with errorbars showing the minimum and maximum time for all cores. I think the plot speaks for itself.
I just tried it on Janus, and while the new import is faster, it's not quite as impressive as what Britton sees; it's definitely not slower, though. At 1024 threads it went from 18 seconds to 16. But clearly that disk is much nicer to work with than Kraken's Lustre. So, I figure we should use this!

--
Stephen Skory
s@skory.us
http://stephenskory.com/
510.621.3687 (google voice)
Matt,

Does the fix you posted for this have any negative side effects that would prevent us from officially adopting this as a solution? If not, should we try to integrate this into yt or update some documentation so people know that they should be using this?

Britton

On Sun, Jan 15, 2012 at 1:01 PM, Stephen Skory <s@skory.us> wrote:
Hi Britton & Matt,
I just tested this new import method with Matt's fix on Kraken against the standard import for a variety of core counts. In the attached figure, I plot the mean time to import, with errorbars showing the minimum and maximum time for all cores. I think the plot speaks for itself.
I just tried on Janus, and while the new import is faster, it's not quite as impressive as what Britton sees, it's definitely not slower. At 1024 threads it went from 18 sec to 16. But clearly that disk is much nicer to work with than Kraken's lustre. So, I figure we should use this!
-- Stephen Skory s@skory.us http://stephenskory.com/ 510.621.3687 (google voice) _______________________________________________ yt-dev mailing list yt-dev@lists.spacepope.org http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
Hi Britton,

My fix has no side effects; I think it is safe to apply. I spoke to Lisandro Dalcin from mpi4py today, and he said it is possible that this will be included in mpi4py proper at some point, so once that happens we should deprecate ours.

I am of the opinion we can do this with an alternate, parallel import that would be compatible with yt.mods, something like yt.pmods. What do you think? What should usage of this be?

Matt

On Jan 16, 2012 10:48 AM, "Britton Smith" <brittonsmith@gmail.com> wrote:
Hi Matt,
I am of the opinion we can do this with an alternate, parallel import that would be compatible with yt.mods. Something like yt.pmods. What do you think? What should usage of this be?
I disagree that there should be a yt.pmods. I think we should try to keep it such that a script that works in serial works in parallel. Obviously, yt.pmods could be made to fall back to serial mode, but I mean that scripts shouldn't have to be modified, if at all possible, to go from serial to parallel.

--
Stephen Skory
s@skory.us
http://stephenskory.com/
510.621.3687 (google voice)
I am of the opinion we can do this with an alternate, parallel import that would be compatible with yt.mods. Something like yt.pmods. What do you think? What should usage of this be?
Not sure I understand this. How is yt.pmods compatible with yt.mods? Do you mean both would have the same effect, but yt.pmods would use the new loading mechanism rather than the current standard one, which yt.mods would retain? If so, that sounds like an OK idea to me, though if there is no side effect, it could lead to people forgetting to sub in pmods and then being stuck with bad, old performance. Given that this will matter most where even a single forgetful job could cost substantial allocation usage, maybe we should think about making it automatic? Or perhaps I am misunderstanding...
Hi all,

Okay, it seems like there is some confusion. My response was in reference to Britton's question, which I understood as "are there any side effects of [your fix for recursive imports] on the operation [of the import hack for MPI]?" There are not.

My original statement, which Stephen disagrees with, is that we should require an explicit change on the part of the user before we (on their behalf) fundamentally modify the way the base functionality of 'import' works for all Python modules. I have several motivations for this:

* The import problem is generic to shared filesystems accessed in parallel, but it is only crippling at relatively large core counts on particularly large Lustre systems, compared to what most users utilize. This is a -1 for global application of the MPI_Import fix.
* The change from yt.mods to yt.pmods is not an invasive change, although I too do not like having different behavior for running in parallel. However, we do expect a number of things from users who run in parallel: an understanding of the resources they are to allocate, a set-up of the queue script, and a recognition of which activities will parallelize and which will not. I still think it is not the best solution, but I believe it is non-invasive.
* Every time we add an additional non-sanitized import, we take a big performance hit. It is in our best interest for this to all occur at the outermost level.
* Detecting whether we are running in parallel is not trivial.
* On some machines, specifically SGI, if you run a script that contains a try/except block for importing MPI and it was not launched with MPI, it will die unceremoniously. We cannot rely on a try/except around importing MPI.

All these things combined lead me to believe that we should not attempt to guess *for* the user.

My proposed change, of adding yt.pmods, would consist of a new file (yt/pmods.py) that contains the full contents of MPI_Import.py and that, at the end, performs this operation:

    with mpi_import():
        from yt.mods import *

What this would result in is a nearly self-contained script that returned to the user the contents of yt.mods; there'd be no duplication. An alternate solution, which I am not terribly keen on, would be to put manual context startup/shutdown inside yt.mods if startup_tasks.parallel_enabled is true. I feel like this would result in a lot of unnecessary side effects. However, with yt.pmods, while the yt imports would all be included, any additional imports would still need sanitation inside the users' scripts. The fallback would be to require users to use the with: statement themselves.

-Matt

On Mon, Jan 16, 2012 at 11:08 AM, j s oishi <jsoishi@gmail.com> wrote:
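[Editor's note: a minimal sketch of what such a yt/pmods.py could look like under Matt's description. Per the proposal, the real file would paste in the full contents of MPI_Import.py rather than importing it; the import below is only there to keep the sketch self-contained.]

    # yt/pmods.py -- hypothetical skeleton of the proposal above.
    # In the real file, the contents of MPI_Import.py (defining the
    # mpi_import context manager) would appear here instead of this import.
    from MPI_Import import mpi_import

    # At the end of the file, pull in the yt namespace through the context
    # manager, so "from yt.pmods import *" gives the same names as
    # "from yt.mods import *" but with rank-0 module lookups.
    with mpi_import():
        from yt.mods import *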
Matt is right about the perils of putting any MPI imports in a try block. Systems like Ranger will fail in a way that is not catchable by Python when trying to import mpi4py while not running in parallel. I think the yt.pmods solution is the best for now. Since the import problem really only gets serious for more than about 100 cores, I think it's OK to impose some additional requirements of understanding on the user if they're going to run jobs that large.

Would it be possible to add some sort of helper function such that you could do something like the following in a script?

    from yt.pmods import *
    parallel_import("from my_analysis import *")

That would be helpful.

Britton

On Mon, Jan 16, 2012 at 1:20 PM, Matthew Turk <matthewturk@gmail.com> wrote:
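[Editor's note: a sketch of the kind of helper Britton is asking about. parallel_import is not an existing yt function; this only shows one way such a helper could behave.]

    # Execute an import statement inside the mpi_import context and place
    # the resulting names into the caller's global namespace.
    import inspect
    from MPI_Import import mpi_import  # or, per the proposal, from yt.pmods

    def parallel_import(statement):
        caller_globals = inspect.currentframe().f_back.f_globals
        with mpi_import():
            exec(statement, caller_globals)

    # usage:
    #   parallel_import("from my_analysis import *")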
Hi Britton,

A side effect of yt.pmods being imported is that you'll get the context manager, so you can then still do:

    from yt.pmods import *
    with mpi_import():
        from my_analysis import *

-Matt

On Mon, Jan 16, 2012 at 1:31 PM, Britton Smith <brittonsmith@gmail.com> wrote:
I think that's good enough for me, then. Should we wait until the fix you provided is accepted into the repo for this, or just go ahead?

Britton

On Mon, Jan 16, 2012 at 1:33 PM, Matthew Turk <matthewturk@gmail.com> wrote:
I have issued a pull request. However, because there is still some dissent, I suggest waiting for Stephen and Jeff to sign off before accepting. Stephen and Jeff, I think it's fair to bring up concerns/disagreements here, but to sign off in the PR if you are okay with it.

https://bitbucket.org/yt_analysis/yt/pull-request/55/add-mpi_import-to-help-...

-Matt

On Mon, Jan 16, 2012 at 1:46 PM, Britton Smith <brittonsmith@gmail.com> wrote:
participants (4)
- Britton Smith
- j s oishi
- Matthew Turk
- Stephen Skory