[pypy-dev] Vectorizing numpy traces

Wed Feb 25 15:20:47 CET 2015

Hi Vincent,
I was aware of the FAQ item (my similar question long ago may have helped put it in the FAQ ;-)
AIUI the main issue is re-establishing the memory mapping, which I think could be done
by mmap-ing the saved files, if those files were created through mmap in the first place (along
with what lsof might show at checkpoint time).

But in order to capture memory, malloc and the gc would have to be reworked to obtain memory by appending
zeroes to mmap-ed files serving as memory pools. If the program is running single-threaded at the time the
checkpoint method is called, there wouldn't be an issue of restoring multiple blocked threads.
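
The file-backed pool idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how PyPy's GC or malloc work: the `FilePool` class, its `grow` method and the file name are hypothetical, and the sketch ignores the fact that a real heap holds absolute pointers, so the pools would have to be mapped back at the same addresses (or be relocated) on resume.

```python
import mmap
import os
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # mapping sizes are kept page-aligned


class FilePool:
    """Hypothetical memory pool backed by a file that grows by appending zeroes."""

    def __init__(self, path, size=PAGE):
        self.path = path
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        if os.fstat(self.fd).st_size < size:
            os.ftruncate(self.fd, size)      # extending a file fills it with zero bytes
        self.size = os.fstat(self.fd).st_size
        self.map = mmap.mmap(self.fd, self.size)

    def grow(self, extra):
        """Append `extra` zero bytes to the file and map the larger file again."""
        self.map.flush()
        self.map.close()
        self.size += extra
        os.ftruncate(self.fd, self.size)
        self.map = mmap.mmap(self.fd, self.size)

    def close(self):
        self.map.flush()                     # push dirty pages to the file
        self.map.close()
        os.close(self.fd)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "pool0.bin")
    pool = FilePool(path)
    pool.map[0:5] = b"hello"
    pool.grow(PAGE)
    pool.map[PAGE:PAGE + 4] = b"warm"
    pool.close()                             # stands in for the "checkpoint"

    resumed = FilePool(path)                 # stands in for the "resume": map the saved file
    data = (bytes(resumed.map[0:5]), bytes(resumed.map[PAGE:PAGE + 4]), resumed.size)
    resumed.close()
    return data
```

Calling `demo()` writes into the pool, grows it, closes it, and then reads the same bytes back through a fresh mapping of the saved file.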

I would guess the remaining issue would be the state of open files, but IWT their state could be saved, and
the reopens and seeks could be done on resume start-up.
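
A minimal sketch of saving open-file state and redoing the reopens and seeks, assuming ordinary files on disk that still exist at resume time. The helper names `snapshot_files` and `restore_files` are hypothetical; sockets, pipes and terminals would need different handling that is not shown here.

```python
import json
import os
import tempfile


def snapshot_files(files):
    """Record path, mode and current offset of each open file as a JSON string."""
    entries = []
    for f in files:
        f.flush()                            # make sure buffered writes reach the file
        entries.append({"path": f.name, "mode": f.mode, "pos": f.tell()})
    return json.dumps(entries)


def restore_files(blob):
    """Reopen each recorded file and seek back to the recorded offset."""
    restored = []
    for entry in json.loads(blob):
        mode = entry["mode"]
        if "w" in mode:
            # Reopening with "w" would truncate the file, so use read/write instead.
            mode = "r+b" if "b" in mode else "r+"
        f = open(entry["path"], mode)
        f.seek(entry["pos"])
        restored.append(f)
    return restored


def demo():
    path = os.path.join(tempfile.mkdtemp(), "data.bin")
    with open(path, "wb") as f:
        f.write(b"0123456789")

    f = open(path, "rb")
    f.read(4)                                # advance the offset before "checkpointing"
    blob = snapshot_files([f])
    f.close()

    (g,) = restore_files(blob)               # "resume": reopen and seek
    rest = g.read()
    g.close()
    return rest
```

In `demo()` the restored file continues reading from offset 4, where the original handle left off.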

A question would be if the jit discards warmed-up code so that it would not be seen on resume, but how would
it know there wasn't going to be an ordinary return from the checkpointing call?

With multiple mmap-ed files serving as memory pools, maybe they could even be put in a hash-identified directory
and gzipped for re-setup by the ELF resumption stub.
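
One possible shape for the hash-identified directory of gzipped pool files, written for Python 3. The function names `archive_pools` and `unpack_pools`, the choice of SHA-256 over the pool contents as the directory name, and the `.gz` suffix are all assumptions made for illustration.

```python
import gzip
import hashlib
import os
import shutil
import tempfile


def archive_pools(pool_paths, root):
    """Gzip each pool file into a directory named after a hash of all pool contents."""
    digest = hashlib.sha256()
    for p in sorted(pool_paths):
        with open(p, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
    target = os.path.join(root, digest.hexdigest())
    os.makedirs(target, exist_ok=True)
    for p in pool_paths:
        out = os.path.join(target, os.path.basename(p) + ".gz")
        with open(p, "rb") as src, gzip.open(out, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target


def unpack_pools(target, dest):
    """Decompress every .gz file in `target` into `dest`; return the restored paths."""
    os.makedirs(dest, exist_ok=True)
    restored = []
    for name in sorted(os.listdir(target)):
        if name.endswith(".gz"):
            out = os.path.join(dest, name[:-3])
            with gzip.open(os.path.join(target, name), "rb") as src, open(out, "wb") as dst:
                shutil.copyfileobj(src, dst)
            restored.append(out)
    return restored


def demo():
    work = tempfile.mkdtemp()
    pools = []
    for i, payload in enumerate([b"heap" + b"\0" * 100, b"stack" + b"\0" * 50]):
        p = os.path.join(work, "pool%d.bin" % i)
        with open(p, "wb") as f:
            f.write(payload)
        pools.append(p)

    target = archive_pools(pools, os.path.join(work, "checkpoints"))
    restored = unpack_pools(target, os.path.join(work, "resumed"))
    contents = []
    for p in restored:
        with open(p, "rb") as f:
            contents.append(f.read())
    return os.path.basename(target), contents
```

A resumption stub would then only need the directory name (the hash) to locate and unpack the pools before mapping them again.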

The idea for an ELF stub would be to write a C program with ELF data space such that you could copy it and append
resumption data, which would then show up in the data space seen by the resuming program. Something along the lines
of the PE/COFF boot stub for the Linux kernel, but just for the individual pypy-warm-resume (at a minimum the name of
the gzipped mmap files archive, if the latter is not actually appended to the stub program).

My hunch is that between the stackless guy Christian and Armin, they could figure it out ;-)
Maybe it's worth a re-think, if only to say "no, we really mean no" in the FAQ ;-)

Regards,
Bengt Richter

On 02/25/2015 07:14 AM Vincent Legoll wrote:
> Hello Bengt,
>
> If I'm not mistaken I think this FAQ item should at least partly answer
> your question :
>
> http://pypy.readthedocs.org/en/latest/faq.html#couldn-t-the-jit-dump-and-reload-already-compiled-machine-code
>
> Regards
>
> On Wed, Feb 25, 2015 at 1:12 AM, Bengt Richter <bokr at oz.net> wrote:
>
>> On 02/24/2015 11:17 PM Maciej Fijalkowski wrote:
>>
>>> Hi Richard.
>>>
>>>
>>> I will respond inline
>>>
>>> On Tue, Feb 24, 2015 at 8:18 PM, Richard Plangger <rich at pasra.at> wrote:
>>>
>>>> hi,
>>>>
>>> [...]
>>
>>>
>>>> (1) Is there a better way to get loops hot?
>>>>
>>>
>>> no, not really (you can lower the threshold though, see pypy --jit
>>> help for details, only global)
>>>
>>>
>> PMJI, but I am curious if it would be possible to change pypy to use
>> mmap'd files for all memory allocations for heaps and stacks
>> so that the state of hotness and jit state could be captured.
>>
>> Files could be virtually large, since UIAM physical allocation does
>> not occur until write, or at least is controllable.
>>
>> The idea then would be to write programs with a warm-up prelude function,
>> and then have a checkpointing module with a method that could write a
>> specially stubbed ELF file along with all the file data, so that the ELF
>> would be an executable whose _start would get back to the checkpoint module
>> where everything would be restored as it was checkpointed, and execution
>> would continue as if just returning from the call to the checkpointing method,
>> which would be after the forced warmup prelude.
>>
>> Sorry if I am intruding.
>> Regards,
>> Bengt Richter
>>
>>
>>
>>
>> _______________________________________________
>> pypy-dev mailing list
>> pypy-dev at python.org
>> https://mail.python.org/mailman/listinfo/pypy-dev
>>