memory leak with re.match
Mayling ge
maylinge0903 at gmail.com
Wed Jul 5 04:36:04 EDT 2017
Sorry, the code here is only pseudocode meant to describe the issue, so please forgive any typos. I listed the lines out because I need the line context.
On 07/05/2017 15:52, Albert-Jan Roskam wrote:

From: Python-list <python-list-bounces+sjeik_appie=hotmail.com at python.org> on behalf of Mayling ge <maylinge0903 at gmail.com>
Sent: Tuesday, July 4, 2017 9:01 AM
To: python-list
Subject: memory leak with re.match
Hi,

My function handles a file line by line. There are multiple error patterns defined, and each one needs to be applied to each line. I use multiprocessing.Pool to handle the file in blocks.

The memory usage increases to 2 GB for a 1 GB file and stays at 2 GB even after the file has been processed. The file is closed at the end.

If I comment out the call to re_pat.match, memory usage is normal and stays under 100 MB.

Am I using re in a wrong way? I cannot figure out a way to fix the memory leak, and I have already googled it.
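Since the body of line_match is snipped in the quoted code below, here is a minimal sketch of what such a worker could look like, assuming errors is a list of pattern strings and that only the matching lines need to be returned (the pattern handling and return value are assumptions, not the real code):

import re

def line_match(lines, errors):
    # Compile the error patterns once per chunk; the real patterns are not
    # shown in the original post, so errors is treated here as a list of
    # plain pattern strings passed in by the caller.
    compiled = [re.compile(p) for p in errors]
    matched = []
    for line in lines:
        # apply every error pattern to every line, as described above
        if any(pat.match(line) for pat in compiled):
            matched.append(line)
    return matched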
def line_match(lines, errors):
    <snip>

lines = list(itertools.islice(fo, line_per_proc))
===> do you really need to listify the iterator?
if not lines:
    break
result = p.apply_async(line_match, args=(errors, lines))
===> the signature of line_match is (lines, errors), in args you do (errors, lines)
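Putting the two comments above together, a corrected sketch of the calling loop might look like the following; the file name, chunk size, pattern list and result handling are assumptions, since the full original code is not shown:

import itertools
import multiprocessing
import re

def line_match(lines, errors):
    # same matching logic as sketched earlier: return the lines that match
    # any of the error patterns
    compiled = [re.compile(p) for p in errors]
    return [line for line in lines if any(pat.match(line) for pat in compiled)]

if __name__ == "__main__":
    line_per_proc = 10000            # assumed chunk size
    errors = [r"ERROR", r"FATAL"]    # placeholder patterns
    results = []
    with multiprocessing.Pool() as p, open("big.log") as fo:  # assumed file name
        while True:
            # islice returns an iterator, but apply_async needs a picklable
            # argument, so materialising each chunk as a list is one option
            lines = list(itertools.islice(fo, line_per_proc))
            if not lines:
                break
            # pass the arguments in the same order as the signature:
            # line_match(lines, errors)
            results.append(p.apply_async(line_match, args=(lines, errors)))
        # collect the per-chunk results while the pool is still open
        matched = [line for r in results for line in r.get()]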