From mwm at mired.org Mon Feb 13 22:30:03 2012 From: mwm at mired.org (Mike Meyer) Date: Mon, 13 Feb 2012 16:30:03 -0500 Subject: [concurrency] Issues with the current concurrency mechanisms in Python Message-ID: <20120213163003.78f08f40@bhuda.mired.org> General: All concurrency is built on top of the OS primitives, with little if any higher-level constructs hiding the details of managing concurrency. The exceptions are the queue modules. We need better tools. Mutating unprotected shared memory (always an error) passes silently, which is unpythonic. Processes: There's no portable way to get named shared memory. There's no way to get general python objects into shared memory. Reference counting causes COW pages to be copied unnecessarily. Threads: Most of the standard library has not been audited for thread safety. Everything is shared by default. Sharing should be explicit, because "Explicit is better than implicit." The GIL. Once the GIL is gone, reference counting will slow things down. Distributed systems: [This is basically processes restricted to network communications methods.] The Processes tools are sufficient for distributed systems, except they can't be used without the addition of a rendezvous mechanism. Some proposed solutions: Add shm_open and shm_unlink to the mmap module. This fixes the issue of named shared memory for processes. Add a 'lock' option to mmap.mmap, which will create a lock attribute for the mmap object with the same API as the threading.Lock/RLock objects. This is a stopgap for putting python objects in shared memory, but fills a critical need. Auditing the standard library, not having unprotected mutations of shared objects pass silently, and having everything shared by default are related issues. The latter two require the language system to know whether an object might be exposed to concurrent access, as the answer being "no" might allow eliding the handling of it. A tool for figuring such things out would also be useful as a first step in auditing the standard library. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mwm at mired.org Mon Feb 13 22:37:35 2012 From: mwm at mired.org (Mike Meyer) Date: Mon, 13 Feb 2012 16:37:35 -0500 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: <20120213162733.004413f2@bhuda.mired.org> References: <20120213162733.004413f2@bhuda.mired.org> Message-ID: Forgot one for threads: No pthread_atfork allowing so an application can safely fork while having threads running. From christopherreay at gmail.com Mon Feb 13 22:43:27 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 23:43:27 +0200 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> Message-ID: Thanks for this Mike, Ill get my head around it all -------------- next part -------------- An HTML attachment was scrubbed... URL: From prologic at shortcircuit.net.au Mon Feb 13 22:49:56 2012 From: prologic at shortcircuit.net.au (James Mills) Date: Tue, 14 Feb 2012 07:49:56 +1000 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> Message-ID: <-3911509196669315413@unknownmsgid> I think one thing that could also help is micro threads and/or greenlet support built in to Python. 
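For anyone who hasn't played with them, here is a small sketch of the primitive the third-party greenlet extension already provides (just an illustration of explicit cooperative switching, not a proposal for what a built-in API should look like):

    from greenlet import greenlet, getcurrent

    def counter():
        total = 0
        while True:
            # hand control (and the running total) back to the main greenlet
            n = main.switch(total)
            total += n

    main = getcurrent()
    g = greenlet(counter)

    print g.switch()     # 0 -- starts counter(), which switches straight back
    print g.switch(5)    # 5
    print g.switch(2)    # 7

The switching is explicit and everything runs in one OS thread; the appeal of building something like this in would be giving the I/O libraries one switching primitive to target instead of monkey-patching around each other.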
cheers James Sent from my iPad On 14/02/2012, at 7:43, Christopher Reay wrote: > Thanks for this Mike, Ill get my head around it all > _______________________________________________ > concurrency-sig mailing list > concurrency-sig at python.org > http://mail.python.org/mailman/listinfo/concurrency-sig From taleinat at gmail.com Tue Feb 14 18:06:28 2012 From: taleinat at gmail.com (Tal Einat) Date: Tue, 14 Feb 2012 19:06:28 +0200 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: <-3911509196669315413@unknownmsgid> References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> Message-ID: On Mon, Feb 13, 2012 at 23:49, James Mills wrote: > I think one thing that could also help is micro threads and/or > greenlet support built in to Python. Indeed, that would make Python a more appealing platform for concurrency, and provide another possible route for Python projects which will need to scale to more processors in the future. Right now I'm having a difficult time promoting the use of greenlets or gevent for a Python project, and having this built in (or otherwise officially supported) would have greatly the chance of going that route. I'm guessing this is true for quite a few other projects as well. My project does quite a bit of network I/O in addition to significant computations, and Python threads have such a great overhead... greenlets / fibers would have been great! We could perhaps have used Twisted, but that was hardly even considered due to its being notoriously large, complicated and difficult to learn. - Tal Tiano Einat From christopherreay at gmail.com Wed Feb 15 08:47:37 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 15 Feb 2012 09:47:37 +0200 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> Message-ID: Greenlets, as far as I understand it, are a superprimitive structure. Co-routines, implmemented as an extention to the Interpreter. Im not sure exactly what "officially" supported means in this instance. Co-routines offer a way to circumvent the tree structure convention of program (control) flow constructs. As such, I suppose they offer a way to remodel threading, but im not sure if they could be said to approach the "issues of concurrency" that give rise to questions about locking and object sharing. I think the core of this discussion (which has lost all steam, verve, fun and substance since it was kicked out of python-ideas) is: 1. How can python support multiple external processes interacting with a single set of states (i.e. an interpreter instance) given that the GIL exists? 2. How can we improve sharing of objects between interpreter instances? Does this about sum up where we are? -------------- next part -------------- An HTML attachment was scrubbed... URL: From prologic at shortcircuit.net.au Wed Feb 15 09:17:42 2012 From: prologic at shortcircuit.net.au (James Mills) Date: Wed, 15 Feb 2012 18:17:42 +1000 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> Message-ID: On Wed, Feb 15, 2012 at 17:47, Christopher Reay wrote: > I think the core of this discussion (which has lost all steam, verve, fun > and substance since it was kicked out of python-ideas) is: > > 1. 
How can python support multiple external processes interacting with > a single set of states (i.e. an interpreter instance) given that the GIL > exists? > 2. How can we improve sharing of objects between interpreter instances? > > Does this about sum up where we are? > I believe so. I think we would all agree that attempts at solving both of these problems have been implemented. multiprocessing, twisted, gevent, Kamaelia, circuits, etc. Question is can we being simple concurrency features that we can all use and appreciate? cheers James -------------- next part -------------- An HTML attachment was scrubbed... URL: From christopherreay at gmail.com Wed Feb 15 09:20:45 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 15 Feb 2012 10:20:45 +0200 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> Message-ID: Maybe that should have been "supraprimitive" structure Greenlets, as far as I understand it, are a superprimitive structure. > Co-routines, implmemented as an extention to the Interpreter. -- Be prepared to have your predictions come true -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm at mired.org Wed Feb 15 09:22:20 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 15 Feb 2012 03:22:20 -0500 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> Message-ID: <20120215032220.61869ac2@bhuda.mired.org> On Wed, 15 Feb 2012 09:47:37 +0200 Christopher Reay wrote: > I think the core of this discussion (which has lost all steam, verve, fun > and substance since it was kicked out of python-ideas) is: Yes, it has lost that. I think all the people who want to spout off but not do work left the discussion. From my viewpoint, there are four concurrent (sorry) issues. This is one of them. > 1. How can python support multiple external processes interacting with a > single set of states (i.e. an interpreter instance) given that the GIL > exists? > 2. How can we improve sharing of objects between interpreter instances? > Does this about sum up where we are? That's the "processes" part of the discussion. The second one is - can we "fix" threading to not have the problems it has? Summarizing that problem: 1) Memory management is so hard that so few programmers get it right that most modern languages simply don't let them do it. 2) Concurrent access to mutable objects is *much*, *much* harder than memory management. 3) Threading (whether done with OS or userland threads) expose every object in the program to concurrent access. The languages I've dealt with that fixed this did so by making mutable objects rare, and requiring special tools to mutate them. I don't think that's going to fly in Python. Which leaves tweaking the interpreter so that threads no longer expose most objects to concurrent access. My start on this is a tool to do static analysis to figure out which objects can't be exposed to concurrent access at all, since that 1) provides a start on auditing the standard library for thread safety, and 2) gives us a handle how big the problem really is. All I have to do is find time to work on it... http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From christopherreay at gmail.com Wed Feb 15 12:11:27 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 15 Feb 2012 13:11:27 +0200 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: <20120215032220.61869ac2@bhuda.mired.org> References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> <20120215032220.61869ac2@bhuda.mired.org> Message-ID: Have you checked out the stuff on STM going on in pypy-dev? There are (and im paraphrasing, but ill try and keep it shallow) something like 4 styles of STM in current pure CS thinking (STM being quite new). The discussion on that thread has arrived at something like "its better to implicitly have all mutations within transactions" and then allow those transactions to attempt to automatically resolve any conflicts. 2) Concurrent access to mutable objects is *much*, *much* harder than > memory management. > 3) Threading (whether done with OS or userland threads) expose every object > in the program to concurrent access. This is in contrast with what you mentioned here, about concurrent access to objects, where, in the standard threading/lock model, all mutations are carried out in a non-deterministic space (which is generally not what people want. It can take developers some time to learn and understand that they are operating in a non-deterministic space which isnt convergent with their imagination of what "should" be happening). If you think it is in topic, Ill go and try and dig out the STM thread from pypy an post it somewhere Also I thought that somewhere in this thread is buried the idea that processes and threads are different ("but equal") models of concurrency. Threads suffer from the problem that (by default) "all objects are accesible [for concurrent mutation]" and processes that "no objects are accesible [for concurrent mutation]" As far as I can dig it, the requirement for managing concurrent access to objects within a threaded application is some kind of reflection of the requirement for sharing concurrent access to objects across processes. The issues here are that threading (certainly for Python) is fundamentally limited to running on a single core, and can require dancing around the GIL if multiple processes are accessing our python state concurrently. Christopher (must get back to work) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm at mired.org Wed Feb 15 12:28:35 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 15 Feb 2012 06:28:35 -0500 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> <20120215032220.61869ac2@bhuda.mired.org> Message-ID: <20120215062835.67e1d1d4@bhuda.mired.org> On Wed, 15 Feb 2012 13:11:27 +0200 Christopher Reay wrote: > Have you checked out the stuff on STM going on in pypy-dev? I've seen some of the comments on it, and have checking on what's really going on in the list of things to do. > If you think it is in topic, Ill go and try and dig out the STM thread from > pypy an post it somewhere I certainly think it is, *especially* if they're going to expose STM at the Python level. Adding better tools for dealing with concurrency in general is the fourth large goal. It may be the hardest to deal with, as it certainly involves language changes. 
But it may be required for some of the others as well. > > 2) Concurrent access to mutable objects is *much*, *much* harder than > > memory management. > > 3) Threading (whether done with OS or userland threads) expose every object > > in the program to concurrent access. > This is in contrast with what you mentioned here, about concurrent access > to objects, where, in the standard threading/lock model, all mutations are > carried out in a non-deterministic space (which is generally not what > people want. It can take developers some time to learn and understand that > they are operating in a non-deterministic space which isnt convergent with > their imagination of what "should" be happening). Right. STM is one of the possible fixes. I'm used to seeing it with *other* tools as well, though. > As far as I can dig it, the requirement for managing concurrent access to > objects within a threaded application is some kind of reflection of the > requirement for sharing concurrent access to objects across processes. IIUC, you're right. Concurrent access is a PITA to deal with - at least if you have to lock everything by hand. It doesn't matter how that happens, whether it's because threading exposes everything, or you're using explicitly shared objects between processes. Threading makes it worse because *everything* is shared, so you have to worry about it everywhere. I've seen a couple of cases where module A decides it wants to use threads for something, then module B starts having concurrency problems because it's not thread safe. > The issues here are that threading (certainly for Python) is > fundamentally limited to running on a single core, and can require > dancing around the GIL if multiple processes are accessing our > python state concurrently. This I either don't parse, or don't agree with. Python code can't run concurrently with other Python code - the GIL prevents it. PyPy with STM or Jython (which uses fine-grained locks) and C extensions in cPython that release the GIL can run concurrently, potentially on multiple cores. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From christopherreay at gmail.com Wed Feb 15 13:19:48 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 15 Feb 2012 14:19:48 +0200 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: <20120215062835.67e1d1d4@bhuda.mired.org> References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> <20120215032220.61869ac2@bhuda.mired.org> <20120215062835.67e1d1d4@bhuda.mired.org> Message-ID: > > > The issues here are that threading (certainly for Python) is > > fundamentally limited to running on a single core, and can require > > dancing around the GIL if multiple processes are accessing our > > python state concurrently. > This I either don't parse, or don't agree with. Python code can't run > concurrently with other Python code - the GIL prevents it. PyPy with > STM or Jython (which uses fine-grained locks) and C extensions in > cPython that release the GIL can run concurrently, potentially on > multiple cores. I dont think I parse what you said either. Perhaps we are saying the same thing... Ill have a deeper read of this later and see if I can get it into my brain -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shibturn at gmail.com Wed Feb 15 16:52:38 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 15 Feb 2012 15:52:38 +0000 Subject: [concurrency] Adding shm_open to mmap Message-ID: <4F3BD4C6.80809@gmail.com> On Tue, 14 Feb 2012 23:10:11 -0500 Mike Meyer wrote: > I'd prefer to provide shm_open on Windows if at all possible. The > "sorta-kinda" bothers me. That would also allow for an application to > exit and then resume work stored in a mapped segment (something I've > done before). However, setting this up on Windows isn't something I > can do. Here is a proof-of-concept for shm_open/shm_unlink on Windows. Note that shm_unlink opens the file using FILE_FLAG_DELETE_ON_CLOSE which ensures that subsequent attempts to open the file will not succeed. However, the directory entry will not disappear till all handles have been closed. FILE_ATTRIBUTE_TEMPORARY tells the system we want to try to cache the file without worrying about flushing. sbt import os import msvcrt import tempfile from _multiprocessing import win32 DEV_SHM = tempfile.gettempdir() GENERIC_READ = win32.GENERIC_READ GENERIC_WRITE = win32.GENERIC_WRITE CREATE_NEW = 1 CREATE_ALWAYS = 2 OPEN_EXISTING = 3 OPEN_ALWAYS = 4 TRUNCATE_EXISTING = 5 FILE_SHARE_READ = 1 FILE_SHARE_WRITE = 2 FILE_SHARE_DELETE = 4 FILE_ATTRIBUTE_TEMPORARY = 256 FILE_FLAG_DELETE_ON_CLOSE = 0x04000000 NULL = 0 _CreationDisposition = { os.O_CREAT | os.O_EXCL: CREATE_NEW, os.O_CREAT | os.O_TRUNC: CREATE_ALWAYS, 0: OPEN_EXISTING, os.O_CREAT: OPEN_ALWAYS, os.O_TRUNC: TRUNCATE_EXISTING } _DesiredAccess = { os.O_RDONLY: GENERIC_READ, os.O_RDWR: GENERIC_READ | GENERIC_WRITE } _OsfDesiredAccess = { os.O_RDONLY: os.O_RDONLY, os.O_RDWR: 0 } def _get_path(name, dev_shm): name = name.lstrip('/') if '/' in name or '\\' in name: raise ValueError('invalid name') return os.path.join(dev_shm, name) def shm_open(name, flags, mode, dev_shm=DEV_SHM): # Opening with FILE_SHARE_READ | FILE_SHARE_WRITE | # FILE_SHARE_DELETE ensures that other processes can open the file # and pseudo-unlink it while it is still in use. path = _get_path(name, dev_shm) da_flags = flags & (os.O_RDONLY | os.O_RDWR) cd_flags = flags & ~(os.O_RDONLY | os.O_RDWR) h = win32.CreateFile( path, _DesiredAccess[da_flags], FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, _CreationDisposition[cd_flags], FILE_ATTRIBUTE_TEMPORARY, NULL) try: os.chmod(path, mode) return msvcrt.open_osfhandle(h, _OsfDesiredAccess[da_flags]) except: win32.CloseHandle(h) raise def shm_unlink(name, dev_shm=DEV_SHM): # Opening with FILE_FLAG_DELETE_ON_CLOSE will make subsequent # attempts to open the file fail. However, the name is not # removed from the file system until all handles have been removed. 
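    # The open-and-close just below is deliberate: CreateFile marks the
    # file delete-on-close for this handle, and CloseHandle then lets the
    # OS remove it once every other handle (e.g. a live mapping) is gone.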
path = _get_path(name, dev_shm) h = win32.CreateFile( path, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_TEMPORARY | FILE_FLAG_DELETE_ON_CLOSE, NULL) win32.CloseHandle(h) ## if __name__ == '__main__': import mmap MYMAP = "mymmap" fd = shm_open(MYMAP, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600) m = mmap.mmap(fd, 10) os.close(fd) m[:5] = "hello" fd = shm_open(MYMAP, os.O_RDWR, 0o600) n = mmap.mmap(fd, 10) os.close(fd) n[:] = n[:].upper() n.close() assert os.path.exists(_get_path(MYMAP, DEV_SHM)) shm_unlink(MYMAP) try: fd = shm_open(MYMAP, os.O_RDWR, 0o600) except OSError: pass else: raise AssertionError("expected access denied") print repr(m[:]) m.close() assert not os.path.exists(_get_path(MYMAP, DEV_SHM)) From mwm at mired.org Thu Feb 16 05:57:15 2012 From: mwm at mired.org (Mike Meyer) Date: Wed, 15 Feb 2012 23:57:15 -0500 Subject: [concurrency] Adding shared objects to Python? Message-ID: <20120215235715.7ef3bb75@bhuda.mired.org> Ok, this is an off-the-wall idea. I have no idea if it's even feasible, so I'm asking here. I'm hoping that someone familiar enough with the insides of Python to know if it'll work can comment on it. Can we add a mechanism to be named later to Python that lets us tell the interpreter that some object will be shared, and then have the interpreter put it in an mmap'ed segment that will be/is shared via subprocess? For some objects - the builtin non-container types, for instance - this trivially works. For others, it's not so clear. What happens if I create a shared object that contains non-shared objects? How does time of creation of the object and the sharing subprocess play into this? Can I create an shared instance of a class that's not shared? Basically, it looks like a mess to me, but maybe someone smarter than I who knows the interpreter internals can figure out an implementable set of constraints that's still useful. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mwm at mired.org Thu Feb 16 06:07:46 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 16 Feb 2012 00:07:46 -0500 Subject: [concurrency] libdispatch wrapper? Message-ID: <20120216000746.1f1c7aaa@bhuda.mired.org> Anyone know if there's a Python wrapper for libdispatch? Anyone interested in working on one? Seems like that would be a good tool to have for doing concurrent work! Thanks, http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From 8mayday at gmail.com Thu Feb 16 10:22:36 2012 From: 8mayday at gmail.com (Andrey Popp) Date: Thu, 16 Feb 2012 13:22:36 +0400 Subject: [concurrency] Issues with the current concurrency mechanisms in Python In-Reply-To: <-3911509196669315413@unknownmsgid> References: <20120213162733.004413f2@bhuda.mired.org> <-3911509196669315413@unknownmsgid> Message-ID: <20120216092236.GA60598@work-mpb.zvq.me> On Tue, Feb 14, 2012 at 07:49:56AM +1000, James Mills wrote: > I think one thing that could also help is micro threads and/or > greenlet support built in to Python. I'm thinking more of allowing Python interpreter to have replaceable I/O and threading runtimes, like PyPy or JVM can replace garbage collection mechanisms using command line switch. 
So we can have same code base working under native OS threads or coroutines implemented with greenlets or POSIX {get,make,set,swap}context functions or whatever else. Same for I/O -- socket/time/signal/... modules should be provided by active runtime so no monkey patching (like gevent does) should be happened any more. From seb.binet at gmail.com Thu Feb 16 13:59:23 2012 From: seb.binet at gmail.com (Sebastien Binet) Date: Thu, 16 Feb 2012 13:59:23 +0100 Subject: [concurrency] libdispatch wrapper? In-Reply-To: <20120216000746.1f1c7aaa@bhuda.mired.org> References: <20120216000746.1f1c7aaa@bhuda.mired.org> Message-ID: <87obszexxw.fsf@cern.ch> Mike, On Thu, 16 Feb 2012 00:07:46 -0500, Mike Meyer wrote: > Anyone know if there's a Python wrapper for libdispatch? Anyone > interested in working on one? Seems like that would be a good tool to > have for doing concurrent work! I wrote this very simple minded ctypes-based wrapper last year. haven't dusted it off, though... https://bitbucket.org/binet/py-libdispatch/overview hth, -s -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available URL: From mwm at mired.org Sat Feb 18 14:19:01 2012 From: mwm at mired.org (Mike Meyer) Date: Sat, 18 Feb 2012 08:19:01 -0500 Subject: [concurrency] Talk worth watching Message-ID: <20120218081901.11cf3dc0@bhuda.mired.org> If you haven't seen it, this is probably worth a gander: http://yow.eventer.com/events/1004/talks/1055 It's a research scientist discussing his plans for concurrency in Haskell. I think his goals - and ideas - are pretty solid. However, since he's doing research, he can afford to use an obscure language that makes his life easier. Getting this stuff working in a practical language like Python will be a bit more difficult. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From 8mayday at gmail.com Sat Feb 18 18:02:26 2012 From: 8mayday at gmail.com (Andrey Popp) Date: Sat, 18 Feb 2012 21:02:26 +0400 Subject: [concurrency] Talk worth watching In-Reply-To: <20120218081901.11cf3dc0@bhuda.mired.org> References: <20120218081901.11cf3dc0@bhuda.mired.org> Message-ID: <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> On 18.02.2012, at 17:19, Mike Meyer wrote: > If you haven't seen it, this is probably worth a gander: > http://yow.eventer.com/events/1004/talks/1055 > > It's a research scientist discussing his plans for concurrency in > Haskell. I think his goals - and ideas - are pretty solid. These thing are already implemented in Haskell and more or less practically usable. > However, > since he's doing research, he can afford to use an obscure language > that makes his life easier. Getting this stuff working in a practical > language like Python will be a bit more difficult. > > -- > Mike Meyer http://www.mired.org/ > Independent Software developer/SCM consultant, email for more information. 
> > O< ascii ribbon campaign - stop html mail - www.asciiribbon.org > _______________________________________________ > concurrency-sig mailing list > concurrency-sig at python.org > http://mail.python.org/mailman/listinfo/concurrency-sig From jeremy.mcmillan at gmail.com Sat Feb 18 19:01:43 2012 From: jeremy.mcmillan at gmail.com (Jeremy McMillan) Date: Sat, 18 Feb 2012 12:01:43 -0600 Subject: [concurrency] Talk worth watching In-Reply-To: <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> Message-ID: I've been trying to figure this stuff out for a couple of years, and this is as far as I've got in my musings. Python has lots of functional programming features, but even in functional languages, "side effects" impacting state outside the function's internal scope require "heroic" efforts (like locking or message passing/serialization) to avoid breaking parallel execution or concurrency/responsiveness. The declarative programming style makes it trivial to graph which code cares about which objects in memory. That forces the programmer to solve the problems a priori, which as it turns out, are the real issue with effectively getting rid of the GIL in Python, thus ending this debate. http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html What we need is smarter memory management, but I think it needs to be somewhere between the Spartan approach of functional languages and the relatively cushy GC'ed object-oriented languages. I wonder if there isn't a perspective available from the AST or something that would allow cherry-picking sections of logic which can be decomposed more or less like what you'd end up with from strict functional code, and possibly executing them in another context (interpreter context, thread, greenlet?) which would allow releasing the GIL. Or possibly the opposite, and cherry pick sections of code which must acquire the GIL? The end result is that when you program in Python, but you're careful to code in a style that allows embarrassingly parallel execution, you get some reprieve from the GIL. There has to be some way to guarantee that no code outside your intended parallel context can ever refer to anything inside, and then side-effects when you can't avoid them, will block waiting for the GIL. One practical example would be that using map() and reduce() in Python would automatically parallelize execution across multiple CPUs. On Sat, Feb 18, 2012 at 11:02 AM, Andrey Popp <8mayday at gmail.com> wrote: > On 18.02.2012, at 17:19, Mike Meyer wrote: > > > If you haven't seen it, this is probably worth a gander: > > http://yow.eventer.com/events/1004/talks/1055 > > > > It's a research scientist discussing his plans for concurrency in > > Haskell. I think his goals - and ideas - are pretty solid. > > These thing are already implemented in Haskell and more or less > practically usable. > > > However, > > since he's doing research, he can afford to use an obscure language > > that makes his life easier. Getting this stuff working in a practical > > language like Python will be a bit more difficult. > > > > > -- > > Mike Meyer http://www.mired.org/ > > Independent Software developer/SCM consultant, email for more > information. 
> > > > O< ascii ribbon campaign - stop html mail - www.asciiribbon.org > > _______________________________________________ > > concurrency-sig mailing list > > concurrency-sig at python.org > > http://mail.python.org/mailman/listinfo/concurrency-sig > _______________________________________________ > concurrency-sig mailing list > concurrency-sig at python.org > http://mail.python.org/mailman/listinfo/concurrency-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christopherreay at gmail.com Sat Feb 18 19:32:00 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Sat, 18 Feb 2012 20:32:00 +0200 Subject: [concurrency] Talk worth watching In-Reply-To: References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> Message-ID: fascinating captain :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ubershmekel at gmail.com Sat Feb 18 20:14:39 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Sat, 18 Feb 2012 21:14:39 +0200 Subject: [concurrency] Talk worth watching In-Reply-To: <20120218081901.11cf3dc0@bhuda.mired.org> References: <20120218081901.11cf3dc0@bhuda.mired.org> Message-ID: On Sat, Feb 18, 2012 at 3:19 PM, Mike Meyer wrote: > If you haven't seen it, this is probably worth a gander: > http://yow.eventer.com/events/1004/talks/1055 > > It's a research scientist discussing his plans for concurrency in > Haskell. I think his goals - and ideas - are pretty solid. However, > since he's doing research, he can afford to use an obscure language > that makes his life easier. Getting this stuff working in a practical > language like Python will be a bit more difficult. > > This is pretty much a fanboy rant on Haskell and STM, I did like "parMap", here's a python implementation: from multiprocessing import Pool def f(x): return x*x if __name__ == "__main__": p = Pool(5) print(p.map(f, [1,2,3])) A problem with that for the functional guys is it doesn't work with lambdas or closures. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aahz at pythoncraft.com Sun Feb 19 17:54:44 2012 From: aahz at pythoncraft.com (Aahz) Date: Sun, 19 Feb 2012 08:54:44 -0800 Subject: [concurrency] Talk worth watching In-Reply-To: References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> Message-ID: <20120219165444.GA16308@panix.com> On Sat, Feb 18, 2012, Jeremy McMillan wrote: > > The declarative programming style makes it trivial to graph which code > cares about which objects in memory. That forces the programmer to solve > the problems a priori, which as it turns out, are the real issue with > effectively getting rid of the GIL in Python, thus ending this debate. > > http://dabeaz.blogspot.com/2011/08/inside-look-at-gil-removal-patch-of.html > > What we need is smarter memory management, but I think it needs to be > somewhere between the Spartan approach of functional languages and the > relatively cushy GC'ed object-oriented languages. As Dave's post and comments make clear, a lot of the problem with switching Python's memory management comes from interfacing with external libraries. As long as one of Python's primary goals is to make it easy to hook up external libraries, it will be difficult to get enough energy toward removing the GIL. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Do not taunt happy fun for loops. Do not change lists you are looping over." 
--Remco Gerlich From solipsis at pitrou.net Sun Feb 19 17:57:09 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 19 Feb 2012 17:57:09 +0100 Subject: [concurrency] Talk worth watching In-Reply-To: <20120219165444.GA16308@panix.com> References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> <20120219165444.GA16308@panix.com> Message-ID: <1329670629.3442.3.camel@localhost.localdomain> > As Dave's post and comments make clear, a lot of the problem with > switching Python's memory management comes from interfacing with external > libraries. As long as one of Python's primary goals is to make it easy > to hook up external libraries, it will be difficult to get enough energy > toward removing the GIL. That's not necessarily a big problem. The ob_refcnt field could be kept for a certain duration as an "external references counter" that would prevent any disposal by the GC, until all 3rd party libraries have moved to hypothetical new APIs. In other words, reference counting doesn't have to be totally dropped, it has to be removed from the interpreter's critical paths (such as the eval loop, and many operations of builtin types). From guido at python.org Sun Feb 19 18:39:34 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 19 Feb 2012 09:39:34 -0800 Subject: [concurrency] Talk worth watching In-Reply-To: <1329670629.3442.3.camel@localhost.localdomain> References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> <20120219165444.GA16308@panix.com> <1329670629.3442.3.camel@localhost.localdomain> Message-ID: On Sun, Feb 19, 2012 at 8:57 AM, Antoine Pitrou wrote: > >> As Dave's post and comments make clear, a lot of the problem with >> switching Python's memory management comes from interfacing with external >> libraries. ?As long as one of Python's primary goals is to make it easy >> to hook up external libraries, it will be difficult to get enough energy >> toward removing the GIL. > > That's not necessarily a big problem. The ob_refcnt field could be kept > for a certain duration as an "external references counter" that would > prevent any disposal by the GC, until all 3rd party libraries have moved > to hypothetical new APIs. > > In other words, reference counting doesn't have to be totally dropped, > it has to be removed from the interpreter's critical paths (such as the > eval loop, and many operations of builtin types). This suggestion might actually show the way to gradually deprecating the GIL and reference counts -- they can be kept as a crutch for C code using the traditional Python/C API, while the core of interpreter itself can move on to a different approach. Right now we're in the situation where the core interpreter itself uses the traditional API, but it wouldn't be impossible to switch to a new API while still supporting the old one. Of course, there are lots of details to be figured out, like what should happen at the boundaries. IIUC PyPy uses a limited version of this approach (specifically with the purpose of supporting extension modules) but it uses incompatible representations for two two APIs, so crossing the border is a lot of work. 
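(Just to make the field concrete: ob_refcnt is the count you can watch from pure Python with sys.getrefcount(), and it is what Py_INCREF/Py_DECREF in the traditional C API bump on nearly every call - which is why extension modules are tied to it. A trivial illustration:

    import sys

    x = object()
    print sys.getrefcount(x)   # 2: the name x plus the temporary reference for the call
    y = x
    print sys.getrefcount(x)   # 3: binding y added a reference
    del y
    print sys.getrefcount(x)   # back to 2
)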
-- --Guido van Rossum (python.org/~guido) From mwm at mired.org Wed Feb 22 00:19:26 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 21 Feb 2012 18:19:26 -0500 Subject: [concurrency] Talk worth watching In-Reply-To: References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> Message-ID: <20120221181926.405d8356@bhuda.mired.org> On Sat, 18 Feb 2012 12:01:43 -0600 Jeremy McMillan wrote: > I wonder if there isn't a perspective available from the AST or something > that would allow cherry-picking sections of logic which can be decomposed > more or less like what you'd end up with from strict functional code, and > possibly executing them in another context (interpreter context, thread, > greenlet?) which would allow releasing the GIL. Or possibly the opposite, > and cherry pick sections of code which must acquire the GIL? It certainly looks possible. You don't really need "functional code", just code that you can guarantee contains no references to variables that might be shared. A tool that simply checked functions for being "thread safe" in this way would have a number of uses - starting with deciding how useful such a tool would be in practice. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From christopherreay at gmail.com Wed Feb 22 08:14:34 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 22 Feb 2012 09:14:34 +0200 Subject: [concurrency] Talk worth watching In-Reply-To: <20120221181926.405d8356@bhuda.mired.org> References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> <20120221181926.405d8356@bhuda.mired.org> Message-ID: I like this idea -------------- next part -------------- An HTML attachment was scrubbed... URL: From christopherreay at gmail.com Wed Feb 22 08:15:20 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 22 Feb 2012 09:15:20 +0200 Subject: [concurrency] Talk worth watching In-Reply-To: References: <20120218081901.11cf3dc0@bhuda.mired.org> <523DDE7B-8375-4B2E-9606-1FF118235922@gmail.com> <20120221181926.405d8356@bhuda.mired.org> Message-ID: Id be happy to understudy someone on this, if the project is big enough. I have limited time but would love to work on a project like this Cheistopehr On 22 February 2012 09:14, Christopher Reay wrote: > I like this idea > -- Be prepared to have your predictions come true -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm at mired.org Mon Feb 27 20:12:23 2012 From: mwm at mired.org (Mike Meyer) Date: Mon, 27 Feb 2012 14:12:23 -0500 Subject: [concurrency] [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: <20120227141223.6329ab8f@bhuda.mired.org> On Mon, 27 Feb 2012 11:45:45 -0700 Mark Janssen wrote: > On Mon, Feb 27, 2012 at 11:35 AM, Rob Cliffe wrote: > > I suggested a "mutable" attribute some time ago. > > This could lead to finally doing away with one of Python's FAQs: Why does > > python have lists AND tuples? ?They could be unified into a single type. > > Rob Cliffe. > Yeah, that would be cool. It would force (ok, *allow*) the > documenting of any non-mutable attributes (i.e. when they're mutable, > and why they're being set immutable, etc.). 
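A purely hypothetical sketch of what a per-instance "mutable bit" might feel like, faked here with __setattr__ (a real language-level version would also have to cover item and slice assignment and mutation from C code, which is where any concurrency payoff would come from):

    class Freezable(object):
        _frozen = False

        def freeze(self):
            object.__setattr__(self, '_frozen', True)

        def __setattr__(self, name, value):
            if self._frozen:
                raise TypeError("object has been frozen")
            object.__setattr__(self, name, value)

    p = Freezable()
    p.x = 1        # allowed, still mutable
    p.freeze()
    p.x = 2        # raises TypeError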
This also has implications for people working on making python friendlier for concurrent and parallel programming. > There an interesting question, then, should the mutable bit be on the > Object itself (the whole type) or in each instance....? There's > probably no "provable" or abstract answer to this, but rather just an > organization principle to the language.... Ok, you said "non-mutable attributes" in the first paragraph. That to me implies that the object bound to that attribute can't be changed. This is different from the attribute being bound to an immutable object, which this paragraph implies. Which do you want here? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org