[Python-ideas] Application awareness of memory storage classes

Wed May 18 13:44:12 EDT 2016

On Tue, 17 May 2016 21:39:07 -0700, Ethan Furman <ethan at stoneleaf.us> wrote:
> On 05/16/2016 05:35 PM, R. David Murray wrote:
> 
> > I'm currently working on a project for Intel involving Python and directly
> > addressable non-volatile memory.
> 
> Sounds cool.  The first thing it reminded me of was database transactions.

There are related concerns.  For nvml, we are concerned about the A and
the D of ACID, but not the C or the I.  (There was also some discussion of
a cloud-enabled ACID system using the equivalent of software transactional
memory, but that's not what nvml is focused on.)

To be clear about what I mean by Isolation not being involved: because
DAX makes NVRAM behave like RAM (except persistent), when you update
the data at a memory address in NVRAM, it is immediately visible to
all other threads.  Consistency in ACID terms is not something nvml
is addressing; that's more of an application-level property.  So the
guarantees nvml is providing are only about Atomicity and Durability.
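
As a rough illustration of the "no Isolation" point, here is a minimal
sketch.  It assumes an ordinary temporary file mapped with mmap as a
stand-in for a DAX-mapped NVRAM region; it does not use nvml, and the
names (region, writer, reader) are made up for the example.  A plain
store by one thread is simply there for the other thread to read:

```python
import mmap
import tempfile
import threading

# Assumption: a temporary file stands in for a DAX-mapped NVRAM region.
with tempfile.TemporaryFile() as f:
    f.write(b"\x00" * mmap.PAGESIZE)
    f.flush()
    region = mmap.mmap(f.fileno(), mmap.PAGESIZE)

    seen = []
    written = threading.Event()

    def writer():
        # A plain store: no transaction and nothing hiding it from others.
        region[0:5] = b"hello"
        written.set()

    def reader():
        # The event only orders the read after the write for the demo.
        written.wait()
        seen.append(bytes(region[0:5]))

    threads = [threading.Thread(target=reader),
               threading.Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    region.close()
```

If the application needs readers not to observe half-finished updates,
it has to arrange that itself (locks, for example), the same as with
ordinary RAM.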

> What are the expected speed comparisons?  I would imagine that 
> persistent RAM is going to be slower.

If I understand correctly, the RAM itself is not *necessarily* slower.
However, providing the crash-proof atomicity requires bookkeeping overhead
that will slow down operations that need to be protected by a transaction
(which will be lots of things but by no means everything) compared to DRAM
access where you don't care about the atomicity in the face of crashes.
The goal is obviously to keep the overhead as small as possible :)
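
To give a feel for where that overhead comes from, here is a minimal
sketch of one common approach, an undo log.  Everything in it is an
assumption made for illustration: the Region class and its methods are
hypothetical and are not the nvml API, a bytearray stands in for the
persistent region, and a Python exception stands in for a crash (a real
system would keep the log in persistent memory and replay it on
restart).

```python
from contextlib import contextmanager


class Region:
    """Hypothetical stand-in for a persistent memory region."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.undo_log = []  # would itself live in persistent memory

    @contextmanager
    def transaction(self):
        self.undo_log.clear()
        try:
            yield self
        except BaseException:
            self.rollback()        # abort: undo every write in the block
            raise
        else:
            self.undo_log.clear()  # commit: old values no longer needed

    def write(self, offset, payload):
        # The bookkeeping cost: snapshot the old bytes before each store.
        end = offset + len(payload)
        self.undo_log.append((offset, bytes(self.data[offset:end])))
        self.data[offset:end] = payload

    def rollback(self):
        # Restore snapshots in reverse order of the writes.
        while self.undo_log:
            offset, old = self.undo_log.pop()
            self.data[offset:offset + len(old)] = old


region = Region(16)

with region.transaction():
    region.write(0, b"AAAA")      # committed

try:
    with region.transaction():
        region.write(0, b"BBBB")
        region.write(4, b"CCCC")
        raise RuntimeError("simulated crash mid-transaction")
except RuntimeError:
    pass                          # both writes have been rolled back
```

Every protected write pays for an extra copy of the old data, which is
the kind of cost an unprotected DRAM store never sees.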

> How much would a typical configuration have?  Enough for all a program's 
> data?

Yes.  I'm sure systems will start with "smaller" amounts of DAX NVRAM,
and larger amounts will become common as costs drop.  But if I
understand correctly we're already talking about significant amounts.

In fact, I played around with the idea of just pointing python3's memory
allocation at NVRAM and running it, but that does not get you the
transaction-level support that would allow crash recovery.

> What are the expected use-cases?  For example, loop variables might not 
> be a prime target for being stored in persistent RAM.

Probably not, but you might be surprised.  The main target would of
course be the data the program wants to preserve between runs, but one of
the examples at pmem.io is a game program whose state is all in nvram.
If the machine crashes in the middle of the game, upon restart you pick
up exactly where you left off, including all the positioning information
you might consider to be part of the "hot loop" part of the program.
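
A minimal sketch of that "pick up where you left off" idea, under loud
assumptions: an ordinary file mapped with mmap stands in for NVRAM, the
state layout (x, y, score) and the open_state helper are invented for
the example, and there is no transaction protection here, so this shows
only the persistence half of the story, not the crash safety.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical game state layout: x position, y position, score.
STATE = struct.Struct("<iii")


def open_state(path):
    """Map the state file, creating a zeroed one on first run."""
    if not os.path.exists(path) or os.path.getsize(path) < STATE.size:
        with open(path, "wb") as f:
            f.write(b"\x00" * STATE.size)
    f = open(path, "r+b")
    return f, mmap.mmap(f.fileno(), STATE.size)


path = os.path.join(tempfile.mkdtemp(), "game.state")

# First run: the game updates its state in place, then goes away.
f, mem = open_state(path)
STATE.pack_into(mem, 0, 3, 4, 100)
mem.flush()
mem.close()
f.close()

# Restart: the state is simply there again, with no load step.
f, mem = open_state(path)
restored = STATE.unpack_from(mem, 0)
mem.close()
f.close()
```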

> What are the advantages of using this persistent RAM directly as 
> variables vs. keeping the access as it's own step via the driver layer?

Speed.

--David