On Tue, 24 May 2016 09:59:42 +0200, "M.-A. Lemburg" wrote:
As soon as you have memory in use which is not fully managed by Python, I don't think there's any way to implement transactions on memory in a meaningful way. The possible side effect in the unmanaged blocks would render such transactions meaningless, since a rollback in those would still leave you with the changes in the unmanaged blocks (other parts of the system).
In this case all the memory will be "managed" at the direction of the Python program (and the extension module). The issue is that while we have transactions on the NVRAM objects, the regular Python objects don't get their state restored if the transaction block aborts. That is part of why I was wondering what it might look like to integrate awareness of storage classes into the language itself.
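To make that gap concrete, here is a minimal sketch. It is not the library's API: `ToyPersistentBuffer` and its `transaction()` method are hypothetical names, and a plain `bytearray` stands in for the NVRAM region. The point is only that the rollback covers the "persistent" bytes and nothing else.

```python
from contextlib import contextmanager


class ToyPersistentBuffer:
    """Hypothetical stand-in for an NVRAM region (a plain bytearray here)."""

    def __init__(self, size):
        self.data = bytearray(size)

    @contextmanager
    def transaction(self):
        # Snapshot only the "persistent" bytes. Ordinary Python objects
        # touched inside the block are invisible to this mechanism.
        snapshot = bytes(self.data)
        try:
            yield self
        except Exception:
            self.data[:] = snapshot  # roll back the persistent state
            raise


pbuf = ToyPersistentBuffer(4)
seen = []  # regular (volatile) Python object

try:
    with pbuf.transaction():
        pbuf.data[0] = 42
        seen.append("wrote 42")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(bytes(pbuf.data))  # persistent bytes are back to their old values
print(seen)              # the list still holds the aborted change
```

After the aborted block, the buffer is restored but `seen` still contains its entry, so the two kinds of state disagree about what happened.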
Now, back on topic: for writing to NVRAM, having a transaction mechanism in place does make sense, but it would have to be clear that only the bits stored in NVRAM are subject to the transaction.
Yes, exactly.
The reason here being that a failure while writing to NVRAM could potentially cause your machine to no longer boot.
I think you misunderstand. We're not talking about "regular" NVRAM, we're talking about memory banks that are exposed to user space via a DAX driver that uses file system semantics to set up the mapping from user space to the NVRAM, but after that some kernel magic allows the user space program to write directly to the NVRAM. We're not doing this with the NVRAM involved in booting the machine, it is separate dedicated storage.
For volatile RAM, at worst, the process will die, but not have much effect on other running parts of the system, so there is less incentive to have transactions (unless, of course, you are deep into STM and want to work around the GIL :-)).
STM is a different approach, and equally valid, but not the one the underlying library takes.
Given that Armin Rigo has been working on STM for years, I'd suggest talking to him about challenges and solutions for transactions on memory.
He's looking at what we might call the reverse of the type of transaction I'm dealing with. An STM transaction makes all changes pending, and throws them away on conflict. Our transaction makes all changes immediately, and *rolls them back* on *failure*. No conflicts are involved, so the things you have to worry about are different from the things you have to worry about in the STM case. I'm sure there are some commonalities, so it may well be worth talking to Armin, since he's thought deeply about this stuff. I'm being handed the transaction machinery by the underlying library, though, so I "only" have to think about how it impacts the Python level :)
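A rough sketch of that difference, using a plain dict as the shared store. Both classes are hypothetical illustrations: the buffered one leaves out the conflict detection a real STM needs, and the undo-log one keeps its log in an ordinary list, whereas the underlying library keeps it in NVRAM.

```python
class BufferedTx:
    """STM-like sketch: writes stay pending until commit."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def write(self, key, value):
        self.pending[key] = value  # shared store is not touched yet

    def commit(self):
        self.store.update(self.pending)
        self.pending.clear()

    def abort(self):
        self.pending.clear()  # nothing to undo, just drop the changes


class UndoLogTx:
    """Rollback sketch: writes land immediately, old values are logged."""

    _MISSING = object()  # marks keys that did not exist before the write

    def __init__(self, store):
        self.store = store
        self.undo_log = []

    def write(self, key, value):
        self.undo_log.append((key, self.store.get(key, self._MISSING)))
        self.store[key] = value  # visible to everyone right away

    def commit(self):
        self.undo_log.clear()

    def abort(self):
        # Replay the log newest-first to restore the old values.
        for key, old in reversed(self.undo_log):
            if old is self._MISSING:
                self.store.pop(key, None)
            else:
                self.store[key] = old
        self.undo_log.clear()


store_a = {"x": 1}
tx = BufferedTx(store_a)
tx.write("x", 2)
visible_during_buffered = store_a["x"]  # still the old value
tx.abort()

store_b = {"x": 1}
tx = UndoLogTx(store_b)
tx.write("x", 2)
tx.write("y", 3)
visible_during_undo = store_b["x"]  # already the new value
tx.abort()
```

Both stores end up unchanged after the abort. What differs is what other readers could observe while the transaction was open, which is why the undo-log style needs locks in threaded code.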
My take on all this would be to work with NVRAM as a block rather than as single memory cells:
    allocate a lock on the NVRAM block
    try:
        copy the block into DRAM
        run manipulations in DRAM block
        write back DRAM block
    finally:
        release lock on NVRAM block
so instead of worrying about a transaction failing while manipulating NVRAM, you only make sure that you can lock the NVRAM block and provide an atomic "block write to NVRAM" functionality.
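In Python, the scheme just described might look roughly like the sketch below. The names `nvram_block`, `nvram_lock` and `update_block` are made up for illustration, a `bytearray` stands in for the NVRAM block, and the slice assignment only stands in for the atomic block write, which would have to be provided by the hardware or a library.

```python
import threading

nvram_block = bytearray(8)     # stand-in for an NVRAM block
nvram_lock = threading.Lock()  # guards the whole block


def update_block(mutate):
    """Apply mutate() to a DRAM copy, then write the block back whole."""
    with nvram_lock:  # released even if mutate() raises
        dram_copy = bytearray(nvram_block)  # copy the block into DRAM
        mutate(dram_copy)                   # manipulate the copy only
        # Assumption: a real system needs a truly atomic write here.
        nvram_block[:] = dram_copy


def good(buf):
    buf[0] = 7


def bad(buf):
    buf[1] = 9
    raise ValueError("simulated failure mid-update")


update_block(good)
try:
    update_block(bad)
except ValueError:
    pass

print(bytes(nvram_block))  # only the successful update is present
```

A failure inside `mutate()` never reaches the block, because the write-back step is skipped when the exception propagates.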
Which is the reverse of what the library actually does. It copies the existing data into an NVRAM rollback log, and then makes the changes to the visible memory (that is, the changes are immediately visible to all threads). The rollback log is then used to undo those changes if the transaction fails. And yes, this means that you need locks around your persistent object updates when doing threaded programming, as I mentioned in my original post.
I'm personally also interested in the STM-style case, since that allows you to write multi-access, potentially distributed, DB-like applications. However, that's not what this particular project is about.
A language that supports persistent storage should support both models, I think, because both are useful for different applications. But the primary difference is what happens during a transaction, so at the language syntax level there is probably no difference.
I guess that means there are two different classes of persistent memory from the application's point of view, even if they can be backed by the same physical memory: rollback persistent, and STM persistent.
--David