Installing packages using pip
Hello, I want to install the following version of pygame on my Windows 10 Pro machine: pygame-1.9.2a0-cp33-none-win_amd64.whl. The file is in .whl format and I don't have any program to open it. I checked the information at https://docs.python.org/3/installing/ to understand how to use pip, but didn't manage it... My result is the following, using the Python 3.5.0 IDLE:
pip install pygame-1.9.2a0-cp34-none-win32.whl
SyntaxError: invalid syntax
pip pygame-1.9.2a0-cp34-none-win32.whl
SyntaxError: invalid syntax
python -m pip install pygame-1.9.2a0-cp33-none-win_amd64.whl
SyntaxError: invalid syntax
python3.5.0 -m pip install pygame-1.9.2a0-cp33-none-win_amd64.whl
SyntaxError: invalid syntax
Can you help me solve my problem?
On 6 November 2015 at 14:33, Ines Barata wrote:
pip install pygame-1.9.2a0-cp34-none-win32.whl
SyntaxError: invalid syntax
That's the correct command, but you need to run it from the Windows command prompt, not from within IDLE. Paul
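For anyone finding this thread later, the distinction is worth spelling out: pip is a program run from the operating system shell, not Python code typed at the >>> prompt. A sketch of the fix (the python3/py command name varies by platform; the wheel filename is the one from the original question, and note that the cpXY tag in a wheel's name must match the Python version, so a cp33/cp34 wheel won't install into Python 3.5):

```shell
# At the OS shell (Command Prompt on Windows), NOT at the Python >>> prompt.
# "python -m pip" guarantees the package goes into that exact interpreter.
python3 -m pip --version
# Then, from the directory holding the downloaded wheel:
#   python3 -m pip install pygame-1.9.2a0-cp34-none-win32.whl
# (left commented out here: the wheel's cpXY tag must match your Python)
```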
On Fri, Nov 6, 2015 at 8:06 AM, Paul Moore wrote:
That's the correct command, but you need to run it from the Windows command prompt, not from within IDLE.
Now that we are talking about how to invoke the installer on other threads... This is NOT the least bit a rare mistake for newbies. Maybe we should have a way to install right from inside the Python REPL. That would certainly clear up the "which python is this going to get installed into" problem.

-CHB

--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R
(206) 526-6959 voice
7600 Sand Point Way NE
(206) 526-6329 fax
Seattle, WA 98115
(206) 526-6317 main reception
Chris.Barker@noaa.gov
import pip
pip.install(PACKAGESPEC)

something like that?

On 11/13/2015 12:42, Chris Barker wrote:
On Fri, Nov 6, 2015 at 8:06 AM, Paul Moore wrote:
That's the correct command, but you need to run it from the Windows command prompt, not from within IDLE.
Now that we are talking about how to invoke the installer on other threads...
This is NOT the least bit a rare mistake for newbies. Maybe we should have a way to install right from inside the python REPL.
That would certainly clear up the "which python is this going to get installed into" problem.
-CHB
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org https://mail.python.org/mailman/listinfo/distutils-sig
On Nov 13, 2015 12:00 PM, "Alexander Walters"
import pip pip.install(PACKAGESPEC)
something like that?
This would be extremely handy if it could be made to work reliably... But I'm skeptical about whether it can be made to work reliably. Consider all the fun things that could happen once you start upgrading packages while python is running, and might e.g. have half of an upgraded package already loaded into memory. It's like the reloading problem but even more so. -n
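In the meantime, the closest safe-ish equivalent is to shell out to pip in a child process, so nothing inside the running interpreter is holding the modules being replaced. A minimal sketch (pip_install is a hypothetical helper name, not anything pip provides, and you still have to restart or reload to see an upgraded package):

```python
import subprocess
import sys

def pip_install(*args, dry_run=False):
    """Run pip as a child process of *this* interpreter.

    Using sys.executable answers the "which python is this going to get
    installed into" question by construction. dry_run exists only so the
    command can be inspected without touching the environment.
    """
    cmd = [sys.executable, "-m", "pip", "install", *args]
    if not dry_run:
        subprocess.check_call(cmd)
    return cmd
```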
On Fri, 13 Nov 2015 12:09:28 -0800, Nathaniel Smith
On Nov 13, 2015 12:00 PM, "Alexander Walters" wrote:
import pip
pip.install(PACKAGESPEC)

something like that?
This would be extremely handy if it could be made to work reliably... But I'm skeptical about whether it can be made to work reliably. Consider all the fun things that could happen once you start upgrading packages while python is running, and might e.g. have half of an upgraded package already loaded into memory. It's like the reloading problem but even more so.
If I remember correctly, this is something that R supports that I thought was cool when I saw it. We could have a command analogous to the 'help' command, so you wouldn't even have to do an explicit import. But yeah, making it work may be hard. --David
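The help()-style idea could be prototyped without touching the interpreter itself, by planting a callable in builtins the way site.py plants help(). A sketch only; the name install and this whole mechanism are hypothetical, and all the reload/restart caveats in this thread still apply:

```python
import builtins
import subprocess
import sys

class _Installer:
    """A help()-like object: repr() gives usage, calling it installs."""

    def __repr__(self):
        return "Call install('package-name') to install a package with pip."

    def __call__(self, spec, dry_run=False):
        # Shell out rather than calling pip in-process, so the running
        # interpreter isn't mutating its own loaded modules.
        cmd = [sys.executable, "-m", "pip", "install", spec]
        if not dry_run:
            subprocess.check_call(cmd)
        return cmd

# site.py could do this at startup, just as it does for help():
builtins.install = _Installer()
```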
On Fri, Nov 13, 2015 at 12:09 PM, Nathaniel Smith
On Nov 13, 2015 12:00 PM, "Alexander Walters" wrote:
import pip
pip.install(PACKAGESPEC)

something like that?
This would be extremely handy if it could be made to work reliably... But I'm skeptical about whether it can be made to work reliably. Consider all the fun things that could happen once you start upgrading packages while python is running, and might e.g. have half of an upgraded package already loaded into memory. It's like the reloading problem but even more so.
Indeed -- it does seem risky.

Also, if we are in fantasy land, and want to be really newbie friendly, a new built-in:

pip.install(PACKAGESPEC)

with no import required....

-CHB
While I like the concept of calling pip via an API (and let me pat myself on the back for suggesting it in the first place in this thread), I honestly think that if it is something that is allowed, it should be implemented with a fair number of guards. It would probably end up being a power-user feature - something to help manage deployments in some tricky environments - rather than a newbie feature.

Python is an IDE-less language, and I say this knowing full well what IDLE is. We don't default to Eclipse like Java does, or Visual Studio like .NET languages (and C(++) on Windows). We do not have the default tooling in place to avoid using the command line. Learning the command line is a vital skill for newbies.

Now, while this thread may or may not be about Windows newbies specifically, I do not tend to see this brought up for *nix newbies. Is this because we assume that a *nix user will have to know the command line? Or that they are inherently power users? If it is the latter, then I need to say that being a programmer also means being a power user. We should guide new users to power-user tools (the command line, PowerShell, etc.) instead of trying to bend Python to regular users who will eventually be power users anyways. I guess I am suggesting we try to find a way to shallow the learning curve into using the command line rather than just implement commands in the REPL itself.

All that said, IDLE could be tooled to intercept the syntax 'pip install foo' and print a more helpful message.

On 11/13/2015 15:27, Chris Barker wrote:
On Fri, Nov 13, 2015 at 12:09 PM, Nathaniel Smith wrote:
On Nov 13, 2015 12:00 PM, "Alexander Walters" wrote:
> import pip
> pip.install(PACKAGESPEC)
>
> something like that?

This would be extremely handy if it could be made to work reliably... But I'm skeptical about whether it can be made to work reliably. Consider all the fun things that could happen once you start upgrading packages while python is running, and might e.g. have half of an upgraded package already loaded into memory. It's like the reloading problem but even more so.
indeed -- does seem risky.
also, if we are in fantasy land, and want to be really newbie friendly, a new built-in:
pip.install(PACKAGESPEC)
with no import required....
-CHB
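Alexander's closing suggestion above, that IDLE intercept pip's command-line syntax and print a helpful message rather than a bare SyntaxError, only needs a little input sniffing before compilation. A sketch of just the check (the function name and the exact message are made up here, not anything IDLE ships):

```python
import re

# Lines a confused user might type at the >>> prompt.
_PIP_AT_PROMPT = re.compile(
    r"^\s*(python[\w.]*\s+-m\s+)?pip[23]?\s+(install|uninstall|download)\b"
)

def pip_typed_at_prompt(source):
    """Return a helpful message if `source` looks like a pip shell command
    typed into the REPL, else None. A shell like IDLE could call this
    before compiling the input and show the message instead of
    'SyntaxError: invalid syntax'."""
    if _PIP_AT_PROMPT.match(source):
        return ("'pip' is a program, not Python code.\n"
                "Run it from the system command prompt, e.g.:\n"
                "    python -m pip install <package>")
    return None
```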
On Nov 13, 2015 3:07 PM, "R. David Murray" wrote:
On Fri, 13 Nov 2015 12:09:28 -0800, Nathaniel Smith wrote:
On Nov 13, 2015 12:00 PM, "Alexander Walters" wrote:
import pip
pip.install(PACKAGESPEC)

something like that?

This would be extremely handy if it could be made to work reliably... But I'm skeptical about whether it can be made to work reliably. Consider all the fun things that could happen once you start upgrading packages while python is running, and might e.g. have half of an upgraded package already loaded into memory. It's like the reloading problem but even more so.
If I remember correctly, this is something that R supports that I thought was cool when I saw it. We could have a command analogous to the 'help' command, so you wouldn't even have to do an explicit import. But yeah, making it work may be hard.
Yeah, I've long used this in R and it really is awesome -- I wasn't kidding in the first sentence I wrote above :-). It leads to a really short frustration cycle:
>>> import somepkg
error
>>> install("somepkg")
installing...done.
>>> import somepkg
:-)
But details of R's execution model make this easier to do. Maybe it could be supported for the special case of installing new packages with no upgrades? A good way to experiment with the possibilities would be to write a %pip magic for ipython: http://ipython.readthedocs.org/en/stable/interactive/tutorial.html#magic-fun... http://ipython.readthedocs.org/en/stable/config/custommagics.html -n
On 13 November 2015 at 23:38, Nathaniel Smith
But details of R's execution model make this easier to do.
Indeed. I don't know how R works, but Python's module caching behaviour would mean this would be full of surprising and confusing corner cases ("I upgraded but I'm still getting the old version" being the simplest and most obvious one).
Maybe it could be supported for the special case of installing new packages with no upgrades?
Possibly. But the rules on what is allowed would likely be fairly complex and hard to understand.
A good way to experiment with the possibilities would be to write a %pip magic for ipython:
Equally, if you want to see how well the model works, you can just start up a Python interpreter session and, when you want to install something, do so in a separate command window. All of the issues I can think of are basically a result of not restarting Python after installing a new package, so you'd probably see most of them like that.

Conversely, if IPython has a "restart the kernel" command, then I see no reason why a %pip magic wouldn't be fine, as long as you restart the kernel after each (series of) %pip commands. The same with Idle: if there's a "restart the interpreter" option, that would be safe.

Of course this doesn't solve the issue of "I want to keep my work in progress", but the fact that you can't is an easier restriction to explain than "only when installing new packages where neither the package install nor any of its dependencies triggers an upgrade"...

Paul
On 14 Nov 2015 11:12, "Paul Moore"
On 13 November 2015 at 23:38, Nathaniel Smith wrote:
But details of R's execution model make this easier to do.
Indeed. I don't know how R works, but Python's module caching behaviour would mean this would be full of surprising and confusing corner cases ("I upgraded but I'm still getting the old version" being the simplest and most obvious one).
Maybe it could be supported for the special case of installing new packages with no upgrades?

Maybe it could prompt the user that the interpreter will need to be restarted for the changes to take effect. IDLE runs the interactive interpreter in a separate process, so it could restart the subprocess without closing the GUI (after prompting the user with a restart/continue dialogue). I'm not sure if the standard interpreter would be able to relaunch itself, but it could at least exit and tell the user to restart (after a yes/no question in the terminal). The command could also be limited to when the interpreter is in interactive mode. How it works in the terminal is less important to me than how it works in IDLE though; being able to teach how to use Python through IDLE (deferring discussion of terminals etc) is useful for introductory programming classes. -- Oscar
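For what it's worth, a plain terminal interpreter can replace itself: os.execv swaps the running process image for a freshly started one. A sketch only; whether an installer should actually do this is exactly what is being debated here:

```python
import os
import sys

def restart_python():
    """Replace the current process with a freshly started interpreter,
    re-running the same script and arguments. All in-memory state is
    lost, which is the point: the new process imports the upgraded
    packages from disk."""
    os.execv(sys.executable, [sys.executable] + sys.argv)
```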
I can perhaps support adding dialogs to IDLE to manage packages (having it shell out to pip, if no API is forthcoming), but I don't think I can support having the REPL inside of IDLE intercept pip's command-line syntax and do anything OTHER than give a better error message. On 11/14/2015 06:37, Oscar Benjamin wrote:
On 14 Nov 2015 11:12, "Paul Moore" wrote:
On 13 November 2015 at 23:38, Nathaniel Smith wrote:
But details of R's execution model make this easier to do.

Indeed. I don't know how R works, but Python's module caching behaviour would mean this would be full of surprising and confusing corner cases ("I upgraded but I'm still getting the old version" being the simplest and most obvious one).

Maybe it could be supported for the special case of installing new packages with no upgrades?
Maybe it could prompt the user that the interpreter will need to be restarted for the changes to take effect. IDLE runs the interactive interpreter in a separate process so it could restart the subprocess without closing the GUI (after prompting the user with a restart/continue dialogue).
I'm not sure if the standard interpreter would be able to relaunch itself but it could at least exit and tell the user to restart (after a yes/no question in the terminal). The command could also be limited to when the interpreter is in interactive mode.
How it works in the terminal is less important to me than how it works in IDLE though; being able to teach how to use Python through IDLE (deferring discussion of terminals etc) is useful for introductory programming classes.
-- Oscar
On Sat, 14 Nov 2015 13:48:51 -0500, Alexander Walters
I perhaps can support added dialogs to IDLE to manage packages (having it shell out to pip, if no api is forthcoming), but I don't think I can support having the repl inside of IDLE intercept pip's command line syntax and do anything OTHER than giving a better error message.
How it works in the terminal is less important to me than how it works in IDLE though; being able to teach how to use Python through IDLE (deferring discussion of terminals etc) is useful for introductory programming classes.

Personally, I don't use IDLE for teaching, but do use IPython. But if we have a way to call pip from a Python REPL, it really should work in the standard REPL. Though it's still a good idea to have IDLE-specific and IPython-specific ways to install packages within those environments. I like the %pip idea for IPython -- and I'm pretty sure the kernel can be restarted. Certainly in a notebook.

As for the plain REPL, maybe a warning that you need to restart after an upgrade would be enough. Though I suspect that Windows' aggressive file locking will put the kibosh on in-place upgrades :)

-CHB
On 15 November 2015 at 20:25, Chris Barker - NOAA Federal
Though I suspect that Windows' aggressive file locking will put the kibosh on in-place upgrades :)
Generally, no. Python loads pyc files with a single read; it doesn't leave the files open. The only locking issues are when you try to upgrade a wrapper exe while it's in use. That might affect you if you try to upgrade IPython from within IPython, but otherwise it's probably fine. Paul
But I think dll/pyd files from extension modules present more of a challenge, since they're left open. I recall some issues around this with conda (e.g. https://github.com/conda/conda-build/pull/520) -Robert On Sun, Nov 15, 2015 at 12:31 PM, Paul Moore
On 15 November 2015 at 20:25, Chris Barker - NOAA Federal wrote:
Though I suspect that Windows' aggressive file locking will put the kibosh on in-place upgrades :)
Generally, no. Python loads pyc files with a single read, it doesn't leave the files open. The only locking issues are when you try to upgrade a wrapper exe while it's in use. That might affect you if you try to upgrade IPython from within IPython, but otherwise it's probably fine.
Paul
On Nov 15, 2015 5:11 PM, "Paul Moore"
On 15 November 2015 at 22:20, Robert McGibbon wrote:
But I think dll/pyd files from extension modules present more of a challenge, since they're left open.
Good point, I'd forgotten about those. Yes, they would cause an upgrade to fail. Sorry.
Windows file locking is the worst. But it *is* possible to get around - see the program called BareTail. I was actually looking around at one point because I wanted to make a cross platform version of it (it really is the best tail application I've ever seen), and I think I came across a way to do it on StackOverflow, but I think it required doing some low level open magic and maybe required the win32api? Anyway, I kind of gave up on it, but that's the sort of thing that I would love to see in core python. But I don't know that it would get there without a motivated individual. -W
On 16 November 2015 at 13:12, Wayne Werner
Windows file locking is the worst. But it *is* possible to get around - see the program called BareTail.
Windows file locking is just complex, and the defaults are "safe" (i.e. people can't change a file you have open). As you say, you can use different locking options with the Win32 API, but because core Python's file API is basically derived from Unix, it doesn't expose these options.

It wouldn't help here anyway, as it's Windows that locks a DLL when you load it. Personally, I don't see why people think that's a bad thing - who wants someone to modify the code they are using behind their back? (I don't know what Unix does; I suspect it retains an old copy of the shared library for the process until the process exits, in which case you'd see a different issue: you do an upgrade, but your process still uses the old code till you restart.)

Long story short, modifying code that a process is using is a bad thing to do. Therefore, the fact that pip can't easily do it is probably a good thing in reality...

Paul
On Mon, 16 Nov 2015, Paul Moore wrote:
On 16 November 2015 at 13:12, Wayne Werner wrote:
Windows file locking is the worst. But it *is* possible to get around - see the program called BareTail.
Windows file locking is just complex, and the defaults are "safe" (i.e. people can't change a file you have open). As you say, you can use different locking options with the Win32 API, but because core Python's file API is basically derived from Unix, it doesn't expose these options.
It wouldn't help here anyway, as it's Windows that locks a DLL when you load it. Personally, I don't see why people think that's a bad thing - who wants someone to modify the code they are using behind their back? (I don't know what Unix does; I suspect it retains an old copy of the shared library for the process until the process exits, in which case you'd see a different issue: you do an upgrade, but your process still uses the old code till you restart.)
This is the case for all files. To check, simply open two terminals and in one:

echo "Going away" >> ~/test.txt
tail -f ~/test.txt

And in the other:

echo "You will not see this" > ~/test.txt

But tail also has a --follow=name mode (-F in GNU tail) that will follow the file by name instead of, I presume, the inode. Of course if you >> append to the file instead, tail *will* actually pick that up.
Long story short, modifying code that a process is using is a bad thing to do. Therefore, the fact that pip can't easily do it is probably a good thing in reality...
I suspect it makes life simple (which is better than complex). My personal assumption about DLL loading would be that it would follow the same pattern as Python importing modules - it's loaded once from disk the first time it's imported, and it never goes back "to disk" for the original DLL.

Though I can also understand the idea behind locking one's files, I kind of put that in the same basket as a language enforcing "private" variables. It just increases the burden of making that particular choice, regardless of how appropriate it may (or may not) be.

I suppose that's mostly academic anyway - I'm not sure that invoking pip from within the repl is *really* the best solution to getting packages installed into the correct Python anyway.

-W
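Wayne's tail demonstration, follow-the-descriptor versus follow-the-name, is the whole trick behind a cross-platform tail tool: poll os.stat and reopen when the name points at a new inode. A rough sketch under those assumptions (no truncation or encoding handling, and it reads from the start of the file rather than seeking to the end like a real tail):

```python
import os
import time

def follow_by_name(path, poll=0.05, max_polls=None):
    """Yield lines from `path`, reopening the file whenever the name is
    rebound to a new inode (what GNU tail calls --follow=name).
    Sketch only: real tools also handle truncation, encodings, seeking
    to EOF on first open, etc."""
    f = None
    inode = None
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            st = os.stat(path)
        except FileNotFoundError:
            st = None
        if st is not None and st.st_ino != inode:
            if f is not None:
                f.close()
            f = open(path)        # the name now points at a fresh inode
            inode = st.st_ino
        line = f.readline() if f is not None else ""
        if line:
            yield line
        else:
            time.sleep(poll)
            polls += 1
```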
On 16 November 2015 at 15:04, Wayne Werner
I suspect it makes life simple (which is better than complex). My personal assumption about DLL loading would be that it would follow the same pattern as Python importing modules - it's loaded once from disk the first time it's imported, and it never goes back "to disk" for the original DLL.
On Windows, DLL loads map the DLL into the code space of the process. Which is why you don't want to change it. (Without some sort of copy on write, which has its own consequences). Basically, it's a trade-off that's handled differently between the two operating systems. We can argue forever over which is "best" without reaching any useful conclusion. And we're way off topic anyway, so let's leave it there :-) Paul
On Mon, Nov 16, 2015 at 01:38:23PM +0000, Paul Moore wrote:
I don't know what Unix does, I suspect it retains an old copy of the shared library for the process until the process exits, in which case you'd see a different issue, that you do an upgrade, but your process still uses the old code till you restart.
Basically. Technically, both Linux and Windows won't let you write to a shared library you have mapped into a process's address space for execution. (You get an ETXTBSY "Text file busy" error on Linux, which one can observe if one tries to re-create a virtualenv while its bin/python is currently running.)

What you can do on Linux that you cannot do on Windows is delete a shared library file while it's mapped into a process's address space. Then Linux lets you create a new file with the same name, while the old file stays around, nameless, until it's no longer used, at which point the disk space gets garbage-collected. (If we can call reference counting "garbage collection".)

The result is as you said: existing processes keep running the old code until you restart them. There are tools (based on lsof, AFAIU) that check for this situation and remind you to restart daemons.

Marius Gedminas
--
We like stress testing, because we know the future will be stressful. -- Maritza Mendez
On 16 November 2015 at 23:38, Paul Moore
(I don't know what Unix does, I suspect it retains an old copy of the shared library for the process until the process exits, in which case you'd see a different issue, that you do an upgrade, but your process still uses the old code till you restart).
Marius explained the lower level technical details, but the relevant API at the Python level is the "fileno()" method on file-like objects: once you have a file descriptor, you can access the kernel object representing the open file directly, and the kernel doesn't care if the original filesystem path has been remapped to refer to something else. The persistent identifier at the filesystem level is the inode number, rather than the filesystem path. After opening a file, the inode numbers match:
>>> import os
>>> f = open("example", "w")
>>> os.stat("example").st_ino
244985
>>> os.stat(f.fileno()).st_ino
244985
The filesystem's reference to the inode can be dropped, without losing the kernel's reference:
>>> os.remove("example")
>>> os.stat("example").st_ino
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'example'
>>> os.stat(f.fileno()).st_ino
244985
The original filesystem path can then be mapped to a new inode:
>>> f2 = open("example", "w")
>>> os.stat("example").st_ino
242960
>>> os.stat(f.fileno()).st_ino
244985
>>> os.stat(f2.fileno()).st_ino
242960
As Wayne noted, the fact that shared libraries can be overwritten while processes are using them is then just an artifact of this general property of *nix style filesystem access.

Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Nov 13, 2015 at 3:09 PM, Nathaniel Smith
On Nov 13, 2015 12:00 PM, "Alexander Walters" wrote:
import pip
pip.install(PACKAGESPEC)

something like that?
This would be extremely handy if it could be made to work reliably... But I'm skeptical about whether it can be made to work reliably. Consider all the fun things that could happen once you start upgrading packages while python is running, and might e.g. have half of an upgraded package already loaded into memory. It's like the reloading problem but even more so.
Sorry to resurrect an old thread, but I have an idea about how to do this somewhat safely, at least insofar as the running interpreter is concerned. It's still a terrible idea. Not such a terrible idea in principle, but as a practical matter in the context of Python it's probably a bad idea because it uses yet-another-.pth-hack.

Consider a "partial install", wherein pip installs all files into a non-imported subdirectory of the target site-packages, along with a .pth file. The distribution is then considered "partially installed" in that the files are there (whether extracted from a wheel, or installed via distutils and the appropriate --root option or similar). For example, consider running
pip.install('requests')
It would be up to the pip.install() command to determine whether or not the requests distribution was already installed. If it's not installed it would proceed as normal. For now I'm assuming the user would still have to manually run `import requests` after this. Auto-import would be nice, but is a separate issue.

Now, if requests were already installed and imported, we don't want to clobber the existing requests running in the interpreter. pip would instead install into the relevant site-packages:

<...>/site-packages/requests-2.8.1.part/
    requests/
    requests-2.8.1.dist-info/
<...>/site-packages/requests-2.8.1.part.pth

The .part/ directory contains the results of the partial installation (for example the contents of the wheel, for wheel installs). The .part.pth file is trickier, but could be something like this:

$ cat requests-2.8.1.part.pth
import inspect, shutil, sys, os, atexit;p = inspect.currentframe().f_locals['sitedir'];part = os.path.join(p, 'requests-2.8.1.part');files = os.path.isdir(part) and os.listdir(part);files and list(map(lambda s, d, f: (sys.modules['shutil'].rmtree(os.path.join(d, f), sys.modules['shutil'].move(os.path.join(s, f), os.path.join(d, f))), [part] * len(files), [p] * len(files), files));os.rmdir(part);pth = part + '.pth';os.path.isfile(pth) and atexit.register(os.unlink, os.path.abspath(pth))

This rifles through the contents of requests-2.8.1.part, deletes any existing directories in the parent site-packages of the same name, completes the install by moving the contents of the .part/ directory into the correct location, and then deletes the .part/ directory. The .part.pth file later deletes itself. By the time the user restarts the interpreter and runs `import requests` this will be completed.

Obviously it would have to be communicated to the user that to upgrade an existing package they will have to restart the interpreter, which is less than ideal, but relates to a deeper limitation of Python that they should get used to anyways.
At least this would enable in-process installs/upgrades. There are of course all kinds of problems with this solution too. It should perhaps only work in a virtualenv and/or .local site-packages (or at least somewhere that the user will have write permissions on the next interpreter run), and it would probably need other error handling too. The above .pth file could also be simplified by invoking a function in pip to complete any partial installs. Erik
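The moves encoded in that one-line .pth file are easier to audit as a plain function; something like the following could live in pip and be invoked by a much shorter .pth stub. This is a sketch of Erik's hypothetical scheme, not anything pip actually ships:

```python
import os
import shutil

def complete_partial_installs(sitedir):
    """Finish any '<dist>.part' staging directories left in `sitedir`
    by an in-process install: remove the old files, move the staged
    files into place, then delete the staging dir and its .pth trigger.
    """
    for entry in os.listdir(sitedir):
        part = os.path.join(sitedir, entry)
        if not (entry.endswith(".part") and os.path.isdir(part)):
            continue
        for name in os.listdir(part):
            dest = os.path.join(sitedir, name)
            if os.path.isdir(dest):
                shutil.rmtree(dest)      # clobber the old version on disk
            elif os.path.isfile(dest):
                os.unlink(dest)
            shutil.move(os.path.join(part, name), dest)
        os.rmdir(part)
        pth = part + ".pth"
        if os.path.isfile(pth):
            os.unlink(pth)               # the trigger removes itself
```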
On Mon, Nov 16, 2015 at 6:25 PM, Marius Gedminas
What you can do on Linux that you cannot do on Windows is delete a shared library file while it's mapped into a process's address space. Then Linux lets you create a new file with the same name, while the old file stays around, nameless, until it's no longer used, at which point the disk space gets garbage-collected. (If we can call reference counting "garbage collection".)
The result is as you said: existing processes keep running the old code until you restart them. There are tools (based on lsof, AFAIU) that check for this situation and remind you to restart daemons.
Not sure what exactly was going on but whenever I did that on linux I got the most peculiar segfaults and failures. It is certainly not a safe thing to do, even if linux lets you do it. Thanks, -- Ionel Cristian Mărieș, http://blog.ionelmc.ro
On Tue, 08 Dec 2015 08:56:49 +0200, contact@ionelmc.ro wrote:
On Mon, Nov 16, 2015 at 6:25 PM, Marius Gedminas wrote:
What you can do on Linux that you cannot do on Windows is delete a shared library file while it's mapped into a process's address space. Then Linux lets you create a new file with the same name, while the old file stays around, nameless, until it's no longer used, at which point the disk space gets garbage-collected. (If we can call reference counting "garbage collection".)
The result is as you said: existing processes keep running the old code until you restart them. There are tools (based on lsof, AFAIU) that check for this situation and remind you to restart daemons.
Not sure what exactly was going on but whenever I did that on linux I got the most peculiar segfaults and failures. It is certainly not a safe thing to do, even if linux lets you do it.
I'm not sure what you did, because to my understanding it certainly should be safe on linux, at least on posix compliant file systems. --David
On Tue, 08 Dec 2015 08:56:49 +0200, contact@ionelmc.ro wrote:
Not sure what exactly was going on but whenever I did that on linux I got the most peculiar segfaults and failures. It is certainly not a safe thing to do, even if linux lets you do it.
Are you sure you were actually unlinking the old file and creating a new one, rather than overwriting the existing file? The latter would certainly cause trouble if you were able to do it. -- Greg
On Wed, Dec 9, 2015 at 12:51 AM, Greg Ewing
Are you sure you were actually unlinking the old file and creating a new one, rather than overwriting the existing file? The latter would certainly cause trouble if you were able to do it.
I had two instances of this problem:

- pip upgrading (pip removes the old version first) some package with C extensions while processes using it still run
- removing (yes, rm -rf, not in-place) and recreating a virtualenv while processes using it still run

It's wrong to think "should be safe on linux". Linux lets you do very stupid things, but that doesn't make them right or feasible to do in the general case. You can do it, sure, but the utility and safety are limited and very specific in scope. You gotta applaud Windows for getting this right. Thanks, -- Ionel Cristian Mărieș, http://blog.ionelmc.ro
On Tue, Dec 8, 2015 at 3:23 PM, Ionel Cristian Mărieș
On Wed, Dec 9, 2015 at 12:51 AM, Greg Ewing wrote:
Are you sure you were actually unlinking the old file and creating a new one, rather than overwriting the existing file? The latter would certainly cause trouble if you were able to do it.
I had two instances of this problem:
- pip upgrading (pip removes old version first) some package with C extensions while processes using that still run
- removing (yes, rm -rf, not inplace) and recreating a virtualenv while processes using that still run
It's wrong to think "should be safe on linux". Linux lets you do very stupid things. But that don't make them right or feasible to do in the general case.
You can do it, sure, but the utility and safety are limited and very specific in scope. You gotta applaud Windows for getting this right.
It's true that this feature of Unix filesystems doesn't automatically make all forms of upgrade safe; in particular, it breaks in cases where an already-running process needs to open some sort of resource/plugin file, and an upgrade process has removed the file or replaced it with an incompatible one in between when the program was started and when it tried to access the resource.

But, seriously, I've been swapping out libraries like libc on running systems on a weekly basis for years (this is pretty standard for debian users), and it basically just works. It's definitely better to reboot after such upgrades to make sure that the new version is in use (e.g. a new version of openssl with security fixes), and to avoid issues like the ones described in the previous paragraph, but generally speaking it's easily possible to have a program that runs fine despite its virtualenv having been deleted out from under it -- the rule is simply that any open/mmap'ed file will continue, perfectly reliably, to refer to the original file until the program exits, even if that file no longer has a name in the filesystem (which is what rm does).

A common example of where you can get weirdness in Python is that Python waits until it has to actually print a traceback before loading the original source code (.py file -- most of the time it just uses the .pyc file), so if you upgrade a python library in-place then existing processes will continue to execute the original code and show correct file names and line numbers in tracebacks, but the actual source lines printed in tracebacks will be incorrect.

I don't really care about trying to rank Windows vs Unix as being "better", obviously there are trade-offs here. (Though it would be nice if Windows had SOME more reasonable solution to the upgrade problem.)
Just want to make sure that the actual semantics here are clear -- there's nothing mysterious about the Unix semantics, and it's pretty easy to predict what will work and what won't once you understand what's going on. -n -- Nathaniel J. Smith -- http://vorpus.org
On Wed, Dec 9, 2015 at 2:18 AM, Nathaniel Smith
Just want to make sure that the actual semantics here are clear -- there's nothing mysterious about the Unix semantics, and it's pretty easy to predict what will work and what won't once you understand what's going on.
You don't have any guarantees that a running process won't try to use stuff from disk later on, do you? If it segfaults (and it does in my "general use cases") it's hard to debug -- you've got nothing conveniently on disk. And no, "upgrading libc" is not a general use case; it's just one of those few things that work because they were written in a very specific way, and you should not apply that technique in the general case. If you want, I can provide you some reproducers, but let's not continue this "but, seriously, it works fine for me" kind of discussion. Thanks, -- Ionel Cristian Mărieș, http://blog.ionelmc.ro
On Tue, Dec 8, 2015 at 7:10 PM, Ionel Cristian Mărieș
On Wed, Dec 9, 2015 at 2:18 AM, Nathaniel Smith wrote:
Just want to make sure that the actual semantics here are clear -- there's nothing mysterious about the Unix semantics, and it's pretty easy to predict what will work and what won't once you understand what's going on.
You don't have any guarantees that a running process won't try to use stuff from disk later on, do you? If it segfaults (and it does in my "general use cases") it's hard to debug -- you've got nothing conveniently on disk.
Yes, exactly: if a running process calls 'open' after a file has been replaced then it gets the new file; if it calls 'open' before a file has been replaced then it gets the old file (and keeps it until it calls 'close', even if it's deleted or renamed-over in the meantime). There's nothing intrinsically segfaulty about this, but sure, if you write your program in such a way that it (a) opens a file while running, and (b) segfaults if the file it wants to open is missing or from the wrong version, then yeah, this will trigger that segfault. Probably it would be better to write your program so that missing or corrupted files produce a more controlled error rather than a segfault, and it would then be more robust regardless of upgrade issues, but I can certainly believe that such buggy programs exist.
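The open-before vs open-after distinction can be sketched concretely. This is a small illustration of the atomic rename-over upgrade pattern that Unix package managers rely on (assuming POSIX rename semantics; the file names here are made up for the example):

```python
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "lib.txt")

# "Install" version 1.
with open(path, "w") as f:
    f.write("version 1")

# A running process opens the file BEFORE the upgrade;
# this handle is pinned to the v1 inode.
old = open(path)

# The "upgrade": write the new version to a temp name, then
# atomically rename it over the old one (this is how package
# managers replace files in-place on Unix).
tmp = path + ".new"
with open(tmp, "w") as f:
    f.write("version 2")
os.rename(tmp, path)

# A process that opens the file AFTER the upgrade sees v2.
new = open(path)

old_data = old.read()   # 'version 1' -- the pre-upgrade handle
new_data = new.read()   # 'version 2' -- the post-upgrade handle
old.close()
new.close()

print(old_data)
print(new_data)
```

Each handle gets a consistent view: the old one never sees a half-written file, because the rename swaps the name from one complete inode to another in a single step.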
And no, "upgrading libc" is not a general usecase, it's just one of those few things that work because they were written in a very specific way, and you should not apply that technique in the general usecase.
It's the general technique that Linux systems always use when upgrading executables and shared libraries, which are the cases that Windows handles differently, so it's probably worth understanding, is all. -n -- Nathaniel J. Smith -- http://vorpus.org
participants (15)
- Alexander Walters
- Chris Barker
- Chris Barker - NOAA Federal
- Erik Bray
- Greg Ewing
- Ines Barata
- Ionel Cristian Mărieș
- Marius Gedminas
- Nathaniel Smith
- Nick Coghlan
- Oscar Benjamin
- Paul Moore
- R. David Murray
- Robert McGibbon
- Wayne Werner