That's incredibly interesting. I've never used mmap before. 

However, there's a problem.

I've now run a few experiments with mmap; this is the latest:

import mmap
import pathlib
import re

path = pathlib.Path(r'P:\huge_file')

with path.open('r') as file:
    mm = mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ)
    for match in re.finditer(b'.', mm):
        pass

The file is 338 GB, and it seems that Python is trying to load it into memory. The process is now using 4 GB of RAM and still growing. I saw the same behavior when searching for a pattern that has no match in the file.

Should I open a Python bug for this? 

On Sun, Oct 7, 2018 at 7:49 PM <> wrote:
On 18-10-07 16.15, Ram Rachum wrote:
 > I tested it now and indeed bytes patterns work on memoryview objects.
 > But how do I use this to scan for patterns through a stream without
 > loading it to memory?

An mmap object is one of the things you can make a memoryview of,
although looking again, it seems you don't even need to, you can
just search the mmap object directly. Searching the mmap object
means the operating system takes care of
the streaming for you, reading in parts of the file only as necessary.
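
As a rough illustration of the point above, here is a minimal sketch of running a bytes pattern over a memory-mapped file. The helper name find_offsets and the small temporary demo file are made up for this example and stand in for the real file. It assumes the file is non-empty, since an empty file cannot be mapped.

```python
import mmap
import os
import re
import tempfile


def find_offsets(path, pattern):
    """Return the start offset of every match of a bytes pattern in a file."""
    with open(path, 'rb') as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The mmap object supports the buffer protocol, so re can scan it
            # directly without first copying the file into a bytes object.
            return [m.start() for m in re.finditer(pattern, mm)]


# Hypothetical demo data, written to a temporary file.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'spam eggs spam ham')

offsets = find_offsets(demo_path, rb'spam')
os.remove(demo_path)
print(offsets)  # [0, 10]
```

Only the match offsets are kept here, so the mapping can be closed as soon as the scan finishes.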

regards, Anders

Python-ideas mailing list