For MarkH, Guido and the Windows experienced:
I was reading Jeffrey Richter's "Advanced Windows" last night, trying to
better understand why PyObject_NEW is implemented differently for Windows.
Again, I feel uncomfortable with this, especially now that I'm dealing with
the memory aspects of Python's object constructors/destructors.
Some time ago, Guido elaborated on why PyObject_NEW uses malloc() on the
user's side, before calling _PyObject_New (on Windows, cf. objimpl.h):
[Guido]
> I can explain the MS_COREDLL business:
>
> This is defined on Windows because the core is in a DLL. Since the
> caller may be in another DLL, and each DLL (potentially) has a
> different default allocator, and (in pre-Vladimir times) the
> type-specific deallocator typically calls free(), we (Mark & I)
> decided that the allocation should be done in the type-specific
> allocator. We changed the PyObject_NEW() macro to call malloc() and
> pass that into _PyObject_New() as a second argument.
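(For reference, the split looks roughly like this -- a from-memory sketch
of the idea, not the literal objimpl.h text:)

    #ifdef MS_COREDLL
    /* the malloc() is expanded in the caller's DLL; the core
       (_PyObject_New) only initializes the passed-in memory */
    #define PyObject_NEW(type, typeobj) \
        ((type *) _PyObject_New((typeobj), \
                    (PyObject *) malloc(_PyObject_SIZE(typeobj))))
    #else
    /* the core DLL both allocates and initializes */
    #define PyObject_NEW(type, typeobj) ((type *) _PyObject_New(typeobj))
    #endif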
While I agree with this, from reading chapters 5-9 of (a French copy of)
the book (chapter titles translated back to English here):
5. Win32 Memory Architecture
6. Exploring Virtual Memory
7. Using Virtual Memory in Your Applications
8. Memory Mapped Files
9. Heaps
I can't find anything radically Windows-specific about memory management.
On Windows, as on the other OSes, the (virtual & physical) memory
allocated for a process is common and seems to be accessible from all
DLLs involved in an executable.
Things like page sharing, copy-on-write, private process mem, etc. are
conceptually all the same on Windows and Unix.
Now, the backwards binary compatibility argument aside (assuming that
extensions get recompiled when a new Python version comes out),
my concern is that with the introduction of PyObject_NEW *and* PyObject_DEL,
there's no point in having separate implementations for Windows and Unix
any more (or I'm really missing something and I fail to see what it is).
User objects would be allocated *and* freed by the core DLL (at least
the object headers). Even if several DLLs use different allocators, this
shouldn't be a problem if what's obtained via PyObject_NEW is freed via
PyObject_DEL. This Python memory would be allocated from the Python core
DLL's regions/pages/heaps. And I believe that the memory allocated by the
core DLL is accessible from the other DLLs of the process.
(I haven't seen evidence to the contrary, but tell me if this is not true.)
I thought that maybe Windows malloc() uses different heaps for the different
DLLs, but that's fine too, as long as the _NEW/_DEL symmetry is respected
and all heaps are accessible from all DLLs (which seems to be the case...),
but:
In the beginning of Chapter 9, Heaps, I read the following:
"""
...About Win32 heaps (compared to Win16 heaps)...
* There is only one kind of heap (it doesn't have any particular name,
like "local" or "global" on Win16, because it's unique)
* Heaps are always local to a process. The contents of a process heap are
not accessible from the threads of another process. A large number of
Win16 applications use the global heap as a way of sharing data between
processes; this change in the Win32 heaps is often a source of problems
for porting Win16 applications to Win32.
* One process can create several heaps in its addressing space and can
manipulate them all.
* A DLL does not have its own heap. It uses the heaps as part of the
addressing space of the process. However, a DLL can create a heap in
the addressing space of a process and reserve it for its own use.
Since several 16-bit DLLs share data between processes by using the
local heap of a DLL, this change is a source of problems when porting
Win16 apps to Win32...
"""
This last paragraph confuses me. On one hand, it's stated that all heaps
can be manipulated by the process, and OTOH, a DLL can reserve a heap for
its own use within that process (implying the heap is r/w protected from
the other DLLs?!). The rest of this chapter does not explain how this
"private reservation" is or can be done, so some of you would probably
want to chime in and explain this to me.
Going back to PyObject_NEW, if it turns out that all heaps are accessible
from all DLLs involved in the process, I would probably lobby for unifying
the implementation of _PyObject_NEW/_New and _PyObject_DEL/_Del for Windows
and Unix.
Actually, on Windows, object allocation does not go through a central
Python core memory allocator. Therefore, with the patches I'm working on,
changing the core allocator would take real effect only on platforms
other than Windows.
Next, if it's possible to unify the implementation, it would also be
possible to expose and make official in the C API a new function set:
PyObject_New() and PyObject_Del() (without leading underscores).
For now, due to the implementation difference on Windows, we're forced to
use the macro versions PyObject_NEW/DEL.
So, please tell me what would go wrong on Windows if a) & b) & c)
(a sketch follows the list):
a) we have PyObject_New(), PyObject_Del()
b) their implementation is platform independent (no MS_COREDLL diffs,
we retain the non-Windows variant)
c) they're both used systematically for all object types
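To make this concrete, here's a sketch of what the unified, core-exported
pair could look like (illustrative bodies only; error checking and debug
hooks elided):

    PyObject *
    PyObject_New(PyTypeObject *tp)
    {
        /* malloc() executes inside the core DLL, so the free() in
           PyObject_Del() is guaranteed to use the same allocator,
           whichever DLL the caller lives in */
        PyObject *op = (PyObject *) malloc(_PyObject_SIZE(tp));
        if (op == NULL)
            return PyErr_NoMemory();
        op->ob_type = tp;
        _Py_NewReference(op);
        return op;
    }

    void
    PyObject_Del(PyObject *op)
    {
        free(op);
    }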
--
Vladimir MARANGOZOV | Vladimir.Marangozov(a)inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
Recently, Moshe Zadka <moshez(a)math.huji.ac.il> said:
> Here's a reason: there shouldn't be changes we'll retract later -- we
> need to come up with the (more or less) right hierarchy the first time,
> or we'll do a lot of work for nothing.
I think I disagree here (hmm, it's probably better to say that I
agree, but I agree on a tangent:-). I think we can be 100% sure that
we're wrong the first time around, and we should plan for that.
One of the reasons why we're wrong is that the world is moving
on. A module that at this point in time resides at some level in
the hierarchy may in a few years (or sooner) be one of a large family
and be better off elsewhere in the hierarchy. It would be silly if it
had to stay where it was because of backward compatibility.
If we plan for being wrong we can make the mistakes less painful. I
think that a simple scheme where a module can say "I'm expecting the
Python 1.6 namespace layout" would make transition to a completely
different Python 1.7 namespace layout a lot less painful, because some
agent could do the mapping. This can either happen at runtime (through
a namespace, or through an import hook, or probably through other
tricks as well) or optionally by a script that would do the
translations.
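As a quick sketch of the runtime variant (the map entries are made up,
of course):

    import __builtin__

    _real_import = __builtin__.__import__
    _16_TO_17_MAP = {'htmllib': 'web.htmllib'}   # imaginary entries

    def _mapping_import(name, globals=None, locals=None, fromlist=None):
        # A real agent would also handle the package-vs-module return
        # value for dotted targets; this only shows the idea.
        return _real_import(_16_TO_17_MAP.get(name, name),
                            globals, locals, fromlist)

    __builtin__.__import__ = _mapping_import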
Of course this doesn't mean we should go off and hack in a couple of
namespaces (hence my "agreeing on a tangent"), but it does mean that I
think Greg's idea of not wanting to change everything at once has
merit.
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen(a)oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
String objects have grown methods since 1.5.2. So it makes sense to
provide a class 'UserString' similar to 'UserList' and 'UserDict', so
that there is a standard base class to inherit from, if someone has the
desire to extend the string methods. What do you think?
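A minimal sketch of what I have in mind, following the UserList/UserDict
pattern (the method list is obviously incomplete):

    class UserString:
        def __init__(self, string=""):
            self.data = str(string)
        def __str__(self):
            return self.data
        def __repr__(self):
            return repr(self.data)
        def __len__(self):
            return len(self.data)
        def __cmp__(self, other):
            if isinstance(other, UserString):
                return cmp(self.data, other.data)
            return cmp(self.data, str(other))
        def __getitem__(self, index):
            return self.__class__(self.data[index])
        def __getslice__(self, i, j):
            return self.__class__(self.data[max(0, i):max(0, j)])
        def __add__(self, other):
            if isinstance(other, UserString):
                return self.__class__(self.data + other.data)
            return self.__class__(self.data + str(other))
        # each string method simply wraps the one on self.data, e.g.:
        def upper(self):
            return self.__class__(self.data.upper())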
Regards, Peter
--
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)
I've written up a list of things that need to get done before 1.6 is
finished. This is my vision of what needs to be done, and doesn't
have an official stamp of approval from GvR or anyone else. So it's
very probably wrong.
http://starship.python.net/crew/amk/python/1.6-jobs.html
Here's the list formatted as text. The major outstanding things at
the moment seem to be sre and Distutils; once they go in, you could
probably release an alpha, because the other items are relatively
minor.
Still to do
* XXX Revamped import hooks (or is this a post-1.6 thing?)
* Update the documentation to match 1.6 changes.
* Document more undocumented modules
* Unicode: Add Unicode support for open() on Windows
* Unicode: Compress the size of unicodedatabase
* Unicode: Write \N{SMILEY} codec for Unicode
* Unicode: the various XXX items in Misc/unicode.txt
* Add module: Distutils
* Add module: Jim Ahlstrom's zipfile.py
* Add module: PyExpat interface
* Add module: mmapfile
* Add module: sre
* Drop cursesmodule and package it separately. (Any other obsolete
modules that should go?)
* Delete obsolete subdirectories in Demo/ directory
* Refurbish Demo subdirectories to be properly documented, match
modern coding style, etc.
* Support Unicode strings in PyExpat interface
* Fix ./ld_so_aix installation problem on AIX
* Make test.regrtest.py more usable outside of the Python test suite
* Conservative garbage collection of cycles (maybe?)
* Write friendly "What's New in 1.6" document/article
Done
Nothing at the moment.
After 1.7
* Rich comparisons
* Revised coercions
* Parallel for loop (for i in L; j in M: ...)
* Extended slicing for all sequences.
* GvR: "I've also been thinking about making classes be types (not
as huge a change as you think, if you don't allow subclassing
built-in types), and adding a built-in array type suitable for use
by NumPy."
--amk
Are there any objections to including
try:
    from cPickle import *
except ImportError:
    pass
in pickle and
try:
    from cStringIO import *
except ImportError:
    pass
in StringIO?
-- ?!ng
"I'm not trying not to answer the question; i'm just not answering it."
-- Lenore Snell
Is there any reason to keep two separate modules with simple formatting
functions? I think pprint is somewhat more sophisticated, but in the
worst case we can just dump them both in the same file (the only catch
being that pprint would then export "repr", in addition to "saferepr",
among others).
(Just bumped into this in my reorg suggestion)
--
Moshe Zadka <mzadka(a)geocities.com>.
http://www.oreilly.com/news/prescod_0300.html
http://www.linux.org.il -- we put the penguin in .com
Hey... just thought I'd drop off a description of the "formal" mechanism
that the ASF uses for voting since it has been seen here and there on this
group :-)
+1 "I'm all for it. Do it!"
+0 "Seems cool and acceptable, but I can also live without it"
-0 "Not sure this is the best thing to do, but I'm not against it."
-1 "Veto. And <HERE> is my reasoning."
Strictly speaking, there is no vetoing here, other than by Guido. For
changes to Apache (as opposed to bug fixes), it depends on where the
development is. Early stages, it is reasonably open and people work
straight against CVS (except for really big design changes). Late stage,
it requires three +1 votes during discussion of a patch before it goes in.
Here on python-dev, it would seem that the votes are a good way to quickly
let Guido know people's feelings about topic X or Y.
On the patches mailing list, the voting could actually be quite a useful
measure for the people with CVS commit access. If a patch gets -1, then
its commit should wait until reason X has been resolved. Note that it can
be resolved in two ways: the person lifts their veto (after some amount of
persuasion or explanation), or the patch is updated to address the
concerns (well, unless the veto is against the concept of the patch
entirely :-). If a patch gets a few +1 votes, then it can probably go
straight in. Note that the Apache guys sometimes say things like "+1 on
concept" meaning they like the idea, but haven't reviewed the code.
Do we formalize on using these? Not really suggesting that. But if I
(and others) drop these things into mail notes, then we may as well have a
description of just what the heck is going on :-)
Cheers,
-g
--
Greg Stein, http://www.lyra.org/
MAL wrote:
> "Andrew M. Kuchling" wrote:
>>
>> Paul Prescod writes:
>>>The new \N escape interpolates named characters within strings. For
>>>example, "Hi! \N{WHITE SMILING FACE}" evaluates to a string with a
>>>unicode smiley face at the end.
>>
>> Cute idea, and it certainly means you can avoid looking up Unicode
>> numbers. (You can look up names instead. :) ) Note that this means the
>> Unicode database is no longer optional if this is done; it has to be
>> around at code-parsing time. Python could import it automatically, as
>> exceptions.py is imported. Christian's work on compressing
>> unicodedatabase.c is therefore really important. (Is Perl5.6 actually
>> dragging around the Unicode database in the binary, or is it read out
>> of some external file or data structure?)
>
> Sorry to disappoint you guys, but the Unicode name and comments
> are *not* included in the unicodedatabase.c file Christian
> is currently working on. The reason is simple: it would add
> huge amounts of string data to the file. So this is a no-no
> for the core distribution...
>
OK, now you're just being silly. It's possible to put the character names
in a separate structure so that they don't automatically get paged in with
the normal Unicode character property data. If you never use it, it won't
get paged in; it's that simple...
Looking up the Unicode code value from the Unicode character name smells
like a good time to use gperf to generate a perfect hash function for the
character names. Esp. for the Unicode 3.0 character namespace. Then you can
just store the hashkey -> Unicode character mapping, and hardly ever need to
page in the actual full character name string itself.
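The gperf input could look something like this (a sketch with made-up
entries; the real file would be generated from the Unicode database and
run through "gperf -t"):

    struct ucname { const char *name; unsigned long code; };
    %%
    LATIN SMALL LETTER A, 0x0061
    GREEK SMALL LETTER ALPHA, 0x03B1
    WHITE SMILING FACE, 0x263A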
I haven't looked at what the comment field contains, so I have no idea how
useful that info is.
*waits while gperf crunches through the ~10,550 Unicode characters where
this would be useful*
Bill
Greetings!
We're working on integrating our own memory manager into our project
and the current challenge is figuring out how to make it play nice
with Python (and SWIG). The approach we're currently taking is to
patch 1.5.2 and augment the PyMem* macros to call external memory
allocation functions that we provide. The idea is to easily allow
the addition of third party memory management facilities to Python.
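Roughly like this (the Ext_* names are ours, just to show the shape of the
patch against Include/mymalloc.h):

    /* hooks supplied by the embedding application */
    extern void *Ext_Malloc(size_t n);
    extern void *Ext_Realloc(void *p, size_t n);
    extern void Ext_Free(void *p);

    /* the PyMem* macros are redirected to the hooks, keeping the
       existing _PyMem_EXTRA padding */
    #define PyMem_NEW(type, n) \
        ( (type *) Ext_Malloc(_PyMem_EXTRA + (n) * sizeof(type)) )
    #define PyMem_RESIZE(p, type, n) \
        ( (p) = (type *) Ext_Realloc((p), _PyMem_EXTRA + (n) * sizeof(type)) )
    #define PyMem_DEL(p) Ext_Free((void *)(p))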
Assuming 1) we get it working :-), and 2) we sync to the latest Python
CVS and patch that, would this be a useful patch to give back to the
community? Has anyone run up against this before?
Thanks,
Jason Asbahr
Origin Systems, Inc.
jasbahr(a)origin.ea.com
Attached you find the latest update of the Unicode implementation.
The patch is against the current CVS version.
It includes the fix I posted yesterday for the core dump problem
in codecs.c (was introduced by my previous patch set -- sorry),
adds more tests for the codecs and two new parser markers
"es" and "es#".
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Only in CVS-Python/Doc/tools: anno-api.py
diff -u -rP -x *.o -x *.pyc -x Makefile -x *~ -x *.so -x add2lib -x pgen -x buildno -x config.* -x libpython* -x python -x Setup -x Setup.local -x Setup.thread -x hassignal -x Makefile.pre -x *.bak -x *.s -x DEADJOE -x Demo -x CVS CVS-Python/Lib/codecs.py Python+Unicode/Lib/codecs.py
--- CVS-Python/Lib/codecs.py Thu Mar 23 23:58:41 2000
+++ Python+Unicode/Lib/codecs.py Fri Mar 17 23:51:01 2000
@@ -46,7 +46,7 @@
handling schemes by providing the errors argument. These
string values are defined:
- 'strict' - raise an error (or a subclass)
+ 'strict' - raise a ValueError (or a subclass)
'ignore' - ignore the character and continue with the next
'replace' - replace with a suitable replacement character;
Python will use the official U+FFFD REPLACEMENT
diff -u -rP -x *.o -x *.pyc -x Makefile -x *~ -x *.so -x add2lib -x pgen -x buildno -x config.* -x libpython* -x python -x Setup -x Setup.local -x Setup.thread -x hassignal -x Makefile.pre -x *.bak -x *.s -x DEADJOE -x Demo -x CVS CVS-Python/Lib/test/output/test_unicode Python+Unicode/Lib/test/output/test_unicode
--- CVS-Python/Lib/test/output/test_unicode Fri Mar 24 22:21:26 2000
+++ Python+Unicode/Lib/test/output/test_unicode Sat Mar 11 00:23:21 2000
@@ -1,5 +1,4 @@
test_unicode
Testing Unicode comparisons... done.
-Testing Unicode contains method... done.
Testing Unicode formatting strings... done.
Testing unicodedata module... done.
diff -u -rP -x *.o -x *.pyc -x Makefile -x *~ -x *.so -x add2lib -x pgen -x buildno -x config.* -x libpython* -x python -x Setup -x Setup.local -x Setup.thread -x hassignal -x Makefile.pre -x *.bak -x *.s -x DEADJOE -x Demo -x CVS CVS-Python/Lib/test/test_unicode.py Python+Unicode/Lib/test/test_unicode.py
--- CVS-Python/Lib/test/test_unicode.py Thu Mar 23 23:58:47 2000
+++ Python+Unicode/Lib/test/test_unicode.py Fri Mar 24 00:29:43 2000
@@ -293,3 +293,33 @@
assert unicodedata.combining(u'\u20e1') == 230
print 'done.'
+
+# Test builtin codecs
+print 'Testing builtin codecs...',
+
+assert unicode('hello','ascii') == u'hello'
+assert unicode('hello','utf-8') == u'hello'
+assert unicode('hello','utf8') == u'hello'
+assert unicode('hello','latin-1') == u'hello'
+
+assert u'hello'.encode('ascii') == 'hello'
+assert u'hello'.encode('utf-8') == 'hello'
+assert u'hello'.encode('utf8') == 'hello'
+assert u'hello'.encode('utf-16-le') == 'h\000e\000l\000l\000o\000'
+assert u'hello'.encode('utf-16-be') == '\000h\000e\000l\000l\000o'
+assert u'hello'.encode('latin-1') == 'hello'
+
+u = u''.join(map(unichr, range(1024)))
+for encoding in ('utf-8', 'utf-16', 'utf-16-le', 'utf-16-be',
+ 'raw_unicode_escape', 'unicode_escape', 'unicode_internal'):
+ assert unicode(u.encode(encoding),encoding) == u
+
+u = u''.join(map(unichr, range(256)))
+for encoding in ('latin-1',):
+ assert unicode(u.encode(encoding),encoding) == u
+
+u = u''.join(map(unichr, range(128)))
+for encoding in ('ascii',):
+ assert unicode(u.encode(encoding),encoding) == u
+
+print 'done.'
diff -u -rP -x *.o -x *.pyc -x Makefile -x *~ -x *.so -x add2lib -x pgen -x buildno -x config.* -x libpython* -x python -x Setup -x Setup.local -x Setup.thread -x hassignal -x Makefile.pre -x *.bak -x *.s -x DEADJOE -x Demo -x CVS CVS-Python/Misc/unicode.txt Python+Unicode/Misc/unicode.txt
--- CVS-Python/Misc/unicode.txt Thu Mar 23 23:58:48 2000
+++ Python+Unicode/Misc/unicode.txt Fri Mar 24 22:29:35 2000
@@ -715,21 +715,126 @@
These markers are used by the PyArg_ParseTuple() APIs:
- 'U': Check for Unicode object and return a pointer to it
+ "U": Check for Unicode object and return a pointer to it
- 's': For Unicode objects: auto convert them to the <default encoding>
+ "s": For Unicode objects: auto convert them to the <default encoding>
and return a pointer to the object's <defencstr> buffer.
- 's#': Access to the Unicode object via the bf_getreadbuf buffer interface
+ "s#": Access to the Unicode object via the bf_getreadbuf buffer interface
(see Buffer Interface); note that the length relates to the buffer
length, not the Unicode string length (this may be different
depending on the Internal Format).
- 't#': Access to the Unicode object via the bf_getcharbuf buffer interface
+ "t#": Access to the Unicode object via the bf_getcharbuf buffer interface
(see Buffer Interface); note that the length relates to the buffer
length, not necessarily to the Unicode string length (this may
be different depending on the <default encoding>).
+ "es":
+ Takes two parameters: encoding (const char *) and
+ buffer (char **).
+
+ The input object is first coerced to Unicode in the usual way
+ and then encoded into a string using the given encoding.
+
+ On output, a buffer of the needed size is allocated and
+ returned through *buffer as NULL-terminated string.
+ The encoded string may not contain embedded NULL characters.
+ The caller is responsible for free()ing the allocated *buffer
+ after usage.
+
+ "es#":
+ Takes three parameters: encoding (const char *),
+ buffer (char **) and buffer_len (int *).
+
+ The input object is first coerced to Unicode in the usual way
+ and then encoded into a string using the given encoding.
+
+ If *buffer is non-NULL, *buffer_len must be set to the size of
+ the buffer on input. Output is then copied to *buffer.
+
+ If *buffer is NULL, a buffer of the needed size is
+ allocated and output copied into it. *buffer is then
+ updated to point to the allocated memory area. The caller
+ is responsible for free()ing *buffer after usage.
+
+ In both cases *buffer_len is updated to the number of
+ characters written (excluding the trailing NULL-byte).
+ The output buffer is assured to be NULL-terminated.
+
+Examples:
+
+Using "es#" with auto-allocation:
+
+ static PyObject *
+ test_parser(PyObject *self,
+ PyObject *args)
+ {
+ PyObject *str;
+ const char *encoding = "latin-1";
+ char *buffer = NULL;
+ int buffer_len = 0;
+
+ if (!PyArg_ParseTuple(args, "es#:test_parser",
+ encoding, &buffer, &buffer_len))
+ return NULL;
+ if (!buffer) {
+ PyErr_SetString(PyExc_SystemError,
+ "buffer is NULL");
+ return NULL;
+ }
+ str = PyString_FromStringAndSize(buffer, buffer_len);
+ free(buffer);
+ return str;
+ }
+
+Using "es" with auto-allocation returning a NULL-terminated string:
+
+ static PyObject *
+ test_parser(PyObject *self,
+ PyObject *args)
+ {
+ PyObject *str;
+ const char *encoding = "latin-1";
+ char *buffer = NULL;
+
+ if (!PyArg_ParseTuple(args, "es:test_parser",
+ encoding, &buffer))
+ return NULL;
+ if (!buffer) {
+ PyErr_SetString(PyExc_SystemError,
+ "buffer is NULL");
+ return NULL;
+ }
+ str = PyString_FromString(buffer);
+ free(buffer);
+ return str;
+ }
+
+Using "es#" with a pre-allocated buffer:
+
+ static PyObject *
+ test_parser(PyObject *self,
+ PyObject *args)
+ {
+ PyObject *str;
+ const char *encoding = "latin-1";
+ char _buffer[10];
+ char *buffer = _buffer;
+ int buffer_len = sizeof(_buffer);
+
+ if (!PyArg_ParseTuple(args, "es#:test_parser",
+ encoding, &buffer, &buffer_len))
+ return NULL;
+ if (!buffer) {
+ PyErr_SetString(PyExc_SystemError,
+ "buffer is NULL");
+ return NULL;
+ }
+ str = PyString_FromStringAndSize(buffer, buffer_len);
+ return str;
+ }
+
File/Stream Output:
-------------------
@@ -837,6 +942,7 @@
History of this Proposal:
-------------------------
+1.3: Added new "es" and "es#" parser markers
1.2: Removed POD about codecs.open()
1.1: Added note about comparisons and hash values. Added note about
case mapping algorithms. Changed stream codecs .read() and
Only in CVS-Python/Objects: .#stringobject.c.2.59
Only in CVS-Python/Objects: stringobject.c.orig
diff -u -rP -x *.o -x *.pyc -x Makefile -x *~ -x *.so -x add2lib -x pgen -x buildno -x config.* -x libpython* -x python -x Setup -x Setup.local -x Setup.thread -x hassignal -x Makefile.pre -x *.bak -x *.s -x DEADJOE -x Demo -x CVS CVS-Python/Python/getargs.c Python+Unicode/Python/getargs.c
--- CVS-Python/Python/getargs.c Sat Mar 11 10:55:21 2000
+++ Python+Unicode/Python/getargs.c Fri Mar 24 20:22:26 2000
@@ -178,6 +178,8 @@
}
else if (level != 0)
; /* Pass */
+ else if (c == 'e')
+ ; /* Pass */
else if (isalpha(c))
max++;
else if (c == '|')
@@ -654,6 +656,122 @@
break;
}
+ case 'e': /* encoded string */
+ {
+ char **buffer;
+ const char *encoding;
+ PyObject *u, *s;
+ int size;
+
+ /* Get 'e' parameter: the encoding name */
+ encoding = (const char *)va_arg(*p_va, const char *);
+ if (encoding == NULL)
+ return "(encoding is NULL)";
+
+ /* Get 's' parameter: the output buffer to use */
+ if (*format != 's')
+ return "(unkown parser marker combination)";
+ buffer = (char **)va_arg(*p_va, char **);
+ format++;
+ if (buffer == NULL)
+ return "(buffer is NULL)";
+
+ /* Convert object to Unicode */
+ u = PyUnicode_FromObject(arg);
+ if (u == NULL)
+ return "string, unicode or text buffer";
+
+ /* Encode object; use default error handling */
+ s = PyUnicode_AsEncodedString(u,
+ encoding,
+ NULL);
+ Py_DECREF(u);
+ if (s == NULL)
+ return "(encoding failed)";
+ if (!PyString_Check(s)) {
+ Py_DECREF(s);
+ return "(encoder failed to return a string)";
+ }
+ size = PyString_GET_SIZE(s);
+
+ /* Write output; output is guaranteed to be
+ 0-terminated */
+ if (*format == '#') {
+ /* Using buffer length parameter '#':
+
+ - if *buffer is NULL, a new buffer
+ of the needed size is allocated and
+ the data copied into it; *buffer is
+ updated to point to the new buffer;
+ the caller is responsible for
+ free()ing it after usage
+
+ - if *buffer is not NULL, the data
+ is copied to *buffer; *buffer_len
+ has to be set to the size of the
+ buffer on input; buffer overflow is
+ signalled with an error; buffer has
+ to provide enough room for the
+ encoded string plus the trailing
+ 0-byte
+
+ - in both cases, *buffer_len is
+ updated to the size of the buffer
+ /excluding/ the trailing 0-byte
+
+ */
+ int *buffer_len = va_arg(*p_va, int *);
+
+ format++;
+ if (buffer_len == NULL)
+ return "(buffer_len is NULL)";
+ if (*buffer == NULL) {
+ *buffer = PyMem_NEW(char, size + 1);
+ if (*buffer == NULL) {
+ Py_DECREF(s);
+ return "(memory error)";
+ }
+ } else {
+ if (size + 1 > *buffer_len) {
+ Py_DECREF(s);
+ return "(buffer overflow)";
+ }
+ }
+ memcpy(*buffer,
+ PyString_AS_STRING(s),
+ size + 1);
+ *buffer_len = size;
+ } else {
+ /* Using a 0-terminated buffer:
+
+ - the encoded string has to be
+ 0-terminated for this variant to
+ work; if it is not, an error is raised
+
+ - a new buffer of the needed size
+ is allocated and the data copied
+ into it; *buffer is updated to
+ point to the new buffer; the caller
+ is responsible for free()ing it
+ after usage
+
+ */
+ if (strlen(PyString_AS_STRING(s)) != size)
+ return "(encoded string without "\
+ "NULL bytes)";
+ *buffer = PyMem_NEW(char, size + 1);
+ if (*buffer == NULL) {
+ Py_DECREF(s);
+ return "(memory error)";
+ }
+ memcpy(*buffer,
+ PyString_AS_STRING(s),
+ size + 1);
+ }
+ Py_DECREF(s);
+ break;
+ }
+
case 'S': /* string object */
{
PyObject **p = va_arg(*p_va, PyObject **);