[Python-Dev] Mixing memory management APIs
M.-A. Lemburg
mal@lemburg.com
Thu, 07 Feb 2002 09:55:11 +0100
Neal Norwitz wrote:
>
> "M.-A. Lemburg" wrote:
> >
> > Neal Norwitz wrote:
> >
> > > "M.-A. Lemburg" wrote:
> > >
> > >
> > >>I've checked in a patch for the UTF-8 codec problem. Could you
> > >>try Purify against the CVS version ?
> > >>
> > >
> > > with-pymalloc or without or both?
> >
> > Both if possible -- the leakage showed up with pymalloc AFAIR :-)
>
> There is a lot of data and it's very hard to follow,
> but I'm trying to provide as much info as I can.
> Let me know how I can make this info easier to use.
>
> Here is a summary:
>
> * I'm using gcc version 2.95.3, on Solaris 8, Purify 2002.
>
> * The new patches don't fix all the problems, but it may
> reduce the problems (I'm not sure). I think there were
> 13k errors on build before, it's 5.5k now.
>
> * test_unicodedata fails:
> *** mismatch between line 3 of expected output and
> line 3 of actual output:
> - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
> + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321
Hmm, I did run test_unicode, but forgot test_unicodedata. Looking
at test_unicodedata.py, it produces loads of unpaired Unicode
surrogates and then tries to encode them using UTF-8. Since the
UTF-8 codec previously produced wrong results for these, I guess I'll
have to regenerate the expected test output.
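
To illustrate why a codec fix changes the "Methods:" digest, here is a
rough sketch. It assumes a modern Python 3 interpreter, not the CVS
version discussed in this thread, and the helper names (encode_strict,
encode_passthrough, digest_of) are made up for the example. It is not
the code in test_unicodedata.py; it only shows that a digest taken over
encoded results depends on how an unpaired surrogate gets encoded.

```python
import hashlib

# An unpaired (lone) high surrogate; not a valid Unicode scalar value.
LONE_SURROGATE = "\ud800"


def encode_strict(text):
    """Return UTF-8 bytes, or None if the strict codec rejects the text."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None


def encode_passthrough(text):
    """Encode with the 'surrogatepass' handler, which emits the 3-byte form."""
    return text.encode("utf-8", "surrogatepass")


def digest_of(chunks):
    """Hypothetical helper: SHA-1 hex digest over a sequence of byte strings."""
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


strict = encode_strict(LONE_SURROGATE)
passed = encode_passthrough(LONE_SURROGATE)

# The same input yields different bytes depending on codec behaviour,
# so any digest computed over the encoded output changes as well.
digest_a = digest_of([b"prefix", passed])
digest_b = digest_of([b"prefix", b"?"])
```

Whenever the bytes produced for such characters change, a stored
expected digest has to be regenerated.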
> * Purify now has 2 UMRs w/o pymalloc, but they are in
> fwrite() and contain no usable stack trace.
>
> * It's probably best to try using Electric Fence and/or dbmalloc.
> This may give better results than Purify.
>
> * There is a warning from sre.h that may be significant:
> Modules/sre.h:24: warning: `SRE_CODE' redefined
> Modules/sre.h:19: warning: this is the location
> of the previous definition
>
> I'll try some more things to see if I can get better info.
>
> Neal
> --
>
> bash-2.03$ ./configure --with-pymalloc --enable-unicode=ucs4
> bash-2.03$ make PURIFY=purify
>
> ---> 5542 errors
> Free Memory Read, Array Bounds Read, and Uninit Memory Read errors
> at lines unicodeobject.c:2214 & 2875
> (both are bogus lines)
>
> 2214 is in: PyUnicode_TranslateCharmap()
> 2875 is in: split_char()
Hmm, I'll have to look at this one...
> bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \
> test_unicode_file.py test_unicodedata.py
> test_unicode
> test test_unicode crashed -- exceptions.UnicodeError: UTF-8 decoding error: illegal encoding
That's strange, because at least on my machine, test_unicode runs
through just fine. Could you run the test by hand, so that the error
location can be pinpointed?
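
One possible way to narrow down a decoding failure by hand is sketched
below. It assumes a modern Python 3 interpreter, where
UnicodeDecodeError carries the offset of the offending byte; the
function name locate_decode_error and the sample data are invented for
illustration and are not part of the test suite.

```python
def locate_decode_error(data, encoding="utf-8"):
    """Return (offset, reason) of the first decoding error, or None if clean."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec rejected.
        return exc.start, exc.reason
    return None


# Three valid ASCII bytes followed by the 3-byte form of a lone surrogate,
# which a strict UTF-8 decoder refuses.
sample = b"ok " + b"\xed\xa0\x80"

result = locate_decode_error(sample)
clean = locate_decode_error(b"ok")
```

The returned offset points at the first rejected byte, which makes it
easier to tell which test input triggers the error.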
> test_unicode_file
> test_unicodedata
> test test_unicodedata produced unexpected output:
> **********************************************************************
> *** mismatch between line 3 of expected output and line 3 of actual output:
> - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
> + Methods: 3051e6d4d117133c3e36a5c22b3a1ae362474321
> **********************************************************************
See above.
> 1 test OK.
> 2 tests failed:
> test_unicode test_unicodedata
>
> --------------------------------------------------------------------
>
> Without purify, test_unicode completed successfully, but unicodedata
> produced the same results.
>
> The errors produced in purify for these 3 tests were 99745.
> The errors were in the same places as for the build step.
>
> --------------------------------------------------------------------
>
> bash-2.03$ make clean
> bash-2.03$ ./configure --enable-unicode=ucs4
> bash-2.03$ make PURIFY=purify
>
> bash-2.03$ ./python -E -tt Lib/test/regrtest.py test_unicode.py \
> test_unicode_file.py test_unicodedata.py
> test test_unicodedata produced unexpected output:
> **********************************************************************
> *** mismatch between line 3 of expected output and line 3 of actual output:
> - Methods: 6c7a7c02657b69d0fdd7a7d174f573194bba2e18
> + Methods: 84b72943b1d4320bc1e64a4888f7cdf62eea219a
> **********************************************************************
> 2 tests OK.
> 1 test failed:
> test_unicodedata
>
> --------------------------------------------------------------------
>
> Purify did have 2 UMRs, but both contain almost no information:
>
> UMR: Uninitialized memory read
> This is occurring while in:
> _write [libc.so.1]
> _xflsbuf [libc.so.1]
> _fflush_u [libc.so.1]
> fseek [libc.so.1]
> *unknown func* [pc=0xe417c]
> *unknown func* [pc=0xe4db4]
> *unknown func* [pc=0xe64c4]
> *unknown func* [pc=0xe5cf0]
> *unknown func* [pc=0xe5524]
> *unknown func* [pc=0xe58a0]
> *unknown func* [pc=0x160464]
> *unknown func* [pc=0x159b64]
> Reading 3609 bytes from 0x6a2fcc in the heap (4 bytes at 0x6a3706 uninit).
> Address 0x6a2fcc is 4 bytes into a malloc'd block at 0x6a2fc8 of 8200 bytes.
> This block was allocated from:
> do_mkvalue [modsupport.c:243]
> _findbuf [libc.so.1]
> _wrtchk [libc.so.1]
> _flsbuf [libc.so.1]
> putc [libc.so.1]
> *unknown func* [pc=0xe8b9c]
> *unknown func* [pc=0xed794]
> *unknown func* [pc=0xe4104]
> *unknown func* [pc=0xe4db4]
> *unknown func* [pc=0xe64c4]
> *unknown func* [pc=0xe5cf0]
> *unknown func* [pc=0xe5524]
>
> --------------------------------------------------------------------
>
> UMR: Uninitialized memory read
> This is occurring while in:
> _write [libc.so.1]
> _xflsbuf [libc.so.1]
> _fwrite_unlocked [libc.so.1]
> fwrite [libc.so.1]
> *unknown func* [pc=0xeaa50]
> *unknown func* [pc=0xeadf4]
> *unknown func* [pc=0xeb3c8]
> *unknown func* [pc=0xed7e8]
> *unknown func* [pc=0xe411c]
> *unknown func* [pc=0xe4db4]
> *unknown func* [pc=0xe64c4]
> *unknown func* [pc=0xe5cf0]
> Reading 8192 bytes from 0x79d88c in the heap (4 bytes at 0x79de8d uninit).
> Address 0x79d88c is 4 bytes into a malloc'd block at 0x79d888 of 8200 bytes.
> This block was allocated from:
> do_mkvalue [modsupport.c:243]
> _findbuf [libc.so.1]
> _wrtchk [libc.so.1]
> _flsbuf [libc.so.1]
> putc [libc.so.1]
> *unknown func* [pc=0xe8b9c]
> *unknown func* [pc=0xed794]
> *unknown func* [pc=0xe4104]
> *unknown func* [pc=0xe4db4]
> *unknown func* [pc=0xe64c4]
> *unknown func* [pc=0xe5cf0]
> *unknown func* [pc=0xe5524]
Thanks,
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/