[Python-Dev] pymalloc and overallocation
(unicodeobject.c,2.139,2.140 checkin)
Tim Peters
tim.one@comcast.net
Sat, 27 Apr 2002 01:01:03 -0400
[martin@v.loewis.de]
> That's what I mean (I'm *really* confused about memory family APIs,
> ever since everything changed :-)
Here's the in-depth course:

    PyMem_xyz     calls the platform malloc/realloc/free (fiddled for
                  x-platform uniformity in NULL and 0 handling)

    PyObject_xyz  calls pymalloc's malloc/realloc/free

and instead of a dozen layers of indirection we've now got crushingly
straightforward WYSIWYG preprocessor blocks like:

    #ifdef WITH_PYMALLOC
    #ifdef PYMALLOC_DEBUG
    #define PyObject_MALLOC         _PyObject_DebugMalloc
    #define PyObject_Malloc         _PyObject_DebugMalloc
    #define PyObject_REALLOC        _PyObject_DebugRealloc
    #define PyObject_Realloc        _PyObject_DebugRealloc
    #define PyObject_FREE           _PyObject_DebugFree
    #define PyObject_Free           _PyObject_DebugFree

    #else   /* WITH_PYMALLOC && ! PYMALLOC_DEBUG */
    #define PyObject_MALLOC         PyObject_Malloc
    #define PyObject_REALLOC        PyObject_Realloc
    #define PyObject_FREE           PyObject_Free
    #endif

    #else   /* ! WITH_PYMALLOC */
    #define PyObject_MALLOC         PyMem_MALLOC
    #define PyObject_REALLOC        PyMem_REALLOC
    #define PyObject_FREE           PyMem_FREE
    #endif  /* WITH_PYMALLOC */

    #define PyObject_Del            PyObject_Free
    #define PyObject_DEL            PyObject_FREE

    /* for source compatibility with 2.2 */
    #define _PyObject_Del           PyObject_Free

All the names you love are still there, it's just that most of them are
redundant now <wink>.
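
The "fiddled for x-platform uniformity in NULL and 0 handling" remark about the PyMem family can be pictured with a small standalone sketch. The sketch_* names are hypothetical and these are not the PyMem_* definitions; the assumption is simply that a zero-byte request gets bumped to one byte, so callers always get back a distinct pointer they can free, whatever the platform malloc does with a size of 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrappers for illustration only; NOT the PyMem_* macros. */

/* Never pass 0 to the platform malloc: some platforms return NULL for
   malloc(0), others return a unique pointer.  Asking for 1 byte instead
   makes NULL mean "out of memory" everywhere. */
static void *
sketch_malloc(size_t n)
{
    return malloc(n ? n : 1);
}

/* Same trick for realloc.  realloc(NULL, n) already behaves like
   malloc(n) in standard C, so a NULL pointer needs no special case. */
static void *
sketch_realloc(void *p, size_t n)
{
    return realloc(p, n ? n : 1);
}

/* free(NULL) is a no-op in standard C, so this is a plain pass-through. */
static void
sketch_free(void *p)
{
    free(p);
}

/* Usage: zero-byte requests still yield a real, freeable pointer
   (barring a genuine out-of-memory condition). */
static int
sketch_demo(void)
{
    void *p = sketch_malloc(0);
    void *q;
    assert(p != NULL);
    q = sketch_realloc(p, 0);
    assert(q != NULL);
    sketch_free(q);
    sketch_free(NULL);
    return 1;
}
```

With wrappers along these lines, callers never have to guess whether a NULL result means failure or just an empty request.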
> ...
> I do think that the Unicode data should be managed by pymalloc as
> well.
Well, that largely depends on how big these suckers are. Calling
PyObject_xyz adds real overhead if pymalloc can't handle the requested size:
all the overhead of the system routines, plus the overhead of pymalloc figuring
out it can't handle it. I expect it's also not good to mix pymalloc with
custom free lists: you hold on to one object from a pymalloc pool, and it
prevents the entire pool from getting recycled for another size class. So
if you want to investigate using pymalloc more heavily for Unicode objects,
I suggest two things:

1. Get rid of the Unicode-specific free list.

2. Change the object layout to embed the str member storage, just as
   PyStringObject does.

#1 is pretty localized, but #2 would require changing a lot of code.
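
To make suggestion #2 concrete, here is a minimal standalone sketch of the two layouts. The struct and function names are hypothetical stand-ins, not the real PyUnicodeObject or PyStringObject definitions, plain malloc stands in for whichever allocator is actually used, and the C99 flexible array member is an assumption of the sketch rather than how the CPython source spells it. The point is only that the pointer layout needs two allocations per object while the embedded layout needs one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real CPython object structs. */

/* Pointer layout: the header and the character data are two separate
   allocations, linked by the str pointer. */
typedef struct {
    size_t length;
    char *str;
} sketch_ptr_string;

/* Embedded layout: the character data lives in the same block as the
   header, directly after it. */
typedef struct {
    size_t length;
    char str[];          /* C99 flexible array member */
} sketch_embedded_string;

static sketch_ptr_string *
sketch_ptr_new(const char *src, size_t n)
{
    sketch_ptr_string *p;
    assert(src != NULL);
    p = malloc(sizeof *p);            /* allocation 1: the header */
    if (p == NULL)
        return NULL;
    p->str = malloc(n + 1);           /* allocation 2: the data */
    if (p->str == NULL) {
        free(p);
        return NULL;
    }
    memcpy(p->str, src, n);
    p->str[n] = '\0';
    p->length = n;
    return p;
}

static void
sketch_ptr_free(sketch_ptr_string *p)
{
    if (p != NULL) {
        free(p->str);
        free(p);
    }
}

static sketch_embedded_string *
sketch_embedded_new(const char *src, size_t n)
{
    sketch_embedded_string *e;
    assert(src != NULL);
    e = malloc(sizeof *e + n + 1);    /* single allocation: header + data */
    if (e == NULL)
        return NULL;
    memcpy(e->str, src, n);
    e->str[n] = '\0';
    e->length = n;
    return e;
}
```

An embedded object is released with a single free(e). The price is that the data can no longer be resized independently of the header, which is one reason a change like this touches a lot of code.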
> Of course, DecodeUTF8 would then raise the same issue: decoding
> UTF-8 doesn't know how many characters you'll get, either. This
> currently does not try to be clever, but allocates enough memory for
> the worst case.
I just put a patch up on SourceForge that's *less* clever, but shouldn't
waste any memory in the end. I expect you'll be happy with it, or rant
inconsolably. It's all the same to me <wink>.
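
For readers who want to see the general shape of "allocate for the worst case, but don't waste memory in the end", here is a minimal standalone sketch. It is not the SourceForge patch and makes no claim about what that patch does. The function name is hypothetical, the decoder assumes well-formed UTF-8 and does no validation, and plain malloc/realloc stand in for the real allocator. It sizes the buffer for the worst case (every byte is its own character), decodes, then gives the excess back with realloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper for illustration; not code from unicodeobject.c.
   Decodes well-formed UTF-8 (no validation!) into code points.  Returns
   a malloc'ed buffer trimmed to the number of code points produced and
   stores that count in *out_len.  Returns NULL if out of memory. */
static uint32_t *
sketch_decode_utf8(const unsigned char *s, size_t n, size_t *out_len)
{
    size_t i = 0, k = 0;
    uint32_t *trimmed;
    /* Worst case: all ASCII, so n bytes give n code points.  Always ask
       for at least one slot to stay away from malloc(0). */
    uint32_t *buf = malloc((n ? n : 1) * sizeof *buf);
    if (buf == NULL)
        return NULL;

    while (i < n) {
        unsigned char c = s[i++];
        uint32_t cp;
        size_t extra;                 /* continuation bytes to consume */
        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }
        for (; extra > 0 && i < n; extra--)
            cp = (cp << 6) | (s[i++] & 0x3F);
        buf[k++] = cp;
    }
    /* Each code point consumed at least one byte, so we never overran. */
    assert(k <= n);

    /* Give the excess back.  If the shrinking realloc fails, the original
       block is still valid, so just keep using it. */
    trimmed = realloc(buf, (k ? k : 1) * sizeof *buf);
    if (trimmed != NULL)
        buf = trimmed;
    *out_len = k;
    return buf;
}
```

Decoding the six bytes of "a", U+00E9 and U+20AC this way starts from a six-slot buffer and ends with a three-slot one. The caller frees the result with free().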