On 27.10.12 03:05, philip.jenvey wrote:
> changeset: 79953:74d65c746f63
> branch: 3.2
> parent: 79941:eb999002916c
> user: Philip Jenvey <pjenvey(a)underboss.org>
> date: Fri Oct 26 17:01:53 2012 -0700
> bounds check for bad data (thanks amaury)
> + if (strlen(p) > 2 &&
First, it produces a compiler warning:
Python/codecs.c: In function ‘PyCodec_SurrogatePassErrors’:
Python/codecs.c:794: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
/usr/include/string.h:397: note: expected ‘const char *’ but argument is of type ‘unsigned char *’
Second, it slows the code down by about 10%:
$ ./python-orig -m timeit -s 'b=b"\xed\xa0\xa0"+b"x"*10000' 'b.decode("utf-8", "surrogatepass")'
100000 loops, best of 3: 12.2 usec per loop
$ ./python -m timeit -s 'b=b"\xed\xa0\xa0"+b"x"*10000' 'b.decode("utf-8", "surrogatepass")'
100000 loops, best of 3: 13.3 usec per loop
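The same measurement can be repeated from inside Python with the standard `timeit` module. This is only a minimal sketch: the helper name `best_usec_per_loop` and the loop counts are assumptions for illustration, not part of the changeset, and the absolute numbers depend on the machine and build.

```python
import timeit

# Same payload as the shell commands above: one UTF-8-encoded lone
# surrogate (U+D820) followed by 10000 ASCII bytes.
payload = b"\xed\xa0\xa0" + b"x" * 10000


def best_usec_per_loop(number=1000, repeat=3):
    """Return the best per-loop time in microseconds (hypothetical helper)."""
    timer = timeit.Timer(lambda: payload.decode("utf-8", "surrogatepass"))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


# The surrogatepass handler lets the lone surrogate through unchanged.
decoded = payload.decode("utf-8", "surrogatepass")
print(len(decoded), "characters, %.1f usec per loop" % best_usec_per_loop())
```

Running this once against each build (the original and the patched one) gives figures comparable to the two `-m timeit` runs.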
I suggest using the following code instead:
if (PyBytes_GET_SIZE(object) - start >= 3 &&
Hello Gang, I am new to this list, so hello PYC gurus!
I have been getting into the Python C API over the past few days and
have found it to be a wonderful framework with a solid design. There
are many questions I would like to pose and answer in this community,
and I look forward to a productive relationship. Here are a few
questions I have:
1. Is it appropriate to post usage questions regarding the Python C
API to this list like the questions below?
2. Are there any examples of a universal setter function for
`PyGetSetDef`? The docs mention using the closure pointer, the last
member of each `PyGetSetDef` structure. I am finding it hard
to grasp how this is implemented. I understand why we use it
and what it does, and I need a more modular setter function for a
large object.
*From the docs* [http://docs.python.org/extending/newtypes.html]
> `The getter function is passed a Noddy object and a “closure”, which is void pointer. In this case, the closure is ignored. (The closure supports an advanced usage in which definition data is passed to the getter and setter. This could, for example, be used to allow a single set of getter and setter functions that decide the attribute to get or set based on data in the closure.)`
3. Is it customary to use header files when building larger Python
modules/objects? I was successful in prototyping the source files,
but unable to get the module working after running a distutils
install. In short, is it encouraged to use header files?
Bust0ut, Surgemcgee: Systems Engineer ---
The Montreal-Python user group would like to host a bug day on October
27 (to be confirmed) at a partner university in Montreal. It would be
cool to do a bug day on IRC like we used to (and in other physical
locations if people want to!) to get new contributors and close bugs.
What do you think?
I forked the CPython repository to work on my "split unicodeobject.c" project:
The result is 10 files (including the existing unicodeobject.c):
This is just a proposal (and a work in progress). Everything can be changed :-)
"unicodenew.c" is not a good name. Content of this file may be moved
Some files may be merged again if the separation is not justified.
I don't like the "unicode" prefix for filenames, I would prefer a new directory.
Shorter files are easier to review and maintain. The compilation is
faster if only one file is modified.
The MBCS codec requires windows.h. The whole unicodeobject.c includes
it just for this codec. With the split, only unicodeoscodecs.c
includes this file.
The MBCS codec also needs a "winver" variable. This variable is
defined between the BLOOM filter and the unicode_result_unchanged()
function. How can you explain how these things are sorted? Where
should I add a new function or variable? With the split, the variable
is now defined very close to where it is used. You don't have to
scroll 7000 lines to see where it is used.
If you would like to work on a specific function, you don't have to
use the search function of your editor to skip thousands of lines. For
example, the 18 functions and 2 types related to the charmap codec are
now grouped into a single short C file.
It was already possible to extend and maintain unicodeobject.c (some
people proved it!), but it should now be much simpler with shorter
Note: unicodeobject.c also includes the huge stringlib library
(4000 lines), which is shared with the bytes type.
Private macros and prototypes of private functions.
Many unicode_xxx() functions have been renamed to _PyUnicode_xxx() so
that they can be reused in different files.
Functions to create a new Unicode string (PyUnicode_New), convert
from/to UCS4 and wchar_t*, resize a string. The ugly part of the PEP
find, replace, compare, split, fill, etc.
"str" type with all methods, _string module and unicodeiter type.
PyUnicode_FromFormat() and PyUnicode_Format()
Text codecs for Python Unicode strings:
- PyUnicode_DecodeRawUnicodeEscape(), PyUnicode_AsRawUnicodeEscapeString()
- PyUnicode_DecodeLatin1(), PyUnicode_AsLatin1String()
- many helpers for other codecs
Character Mapping Codec:
Operating system codecs: MBCS codec, locale (FS) codec => FS encode/decode.
UTF-7/8/16/32 codecs and ASCII decoder.
Legacy and deprecated Unicode API: Py_UNICODE type.
I am Jose Figueroa from Mexico. I usually work with C/C++, Ruby and PHP
(yeah I know =( ) but I will start again with Python because I have some
free time after finishing a project.
I am going to use Python for my Master's degree in Computer Science.
-------- Original Message --------
Subject: Python 3.3 can't sort memoryviews as they're unorderable
Date: Sun, 21 Oct 2012 12:24:32 +0100
From: Mark Lawrence <breamoreboy(a)yahoo.co.uk>
http://docs.python.org/dev/whatsnew/3.3.html states "memoryview
comparisons now use the logical structure of the operands and compare
all array elements by value". So I'd have thought that you should be
able to compare them and hence sort them, but this is the state of play.
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32
bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> memoryview(bytearray(range(5))) == memoryview(bytearray(range(5)))
>>> memoryview(bytearray(range(5))) != memoryview(bytearray(range(5)))
>>> memoryview(bytearray(range(5))) < memoryview(bytearray(range(5)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: memoryview() < memoryview()
Okay then, let's subclass memoryview to provide the functionality.
>>> class Test(memoryview):
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type 'memoryview' is not an acceptable base type
gives examples of equality comparisons and there was nothing that I
could see in PEP3118 to explain the rationale behind the lack of other
comparisons. What have I missed?
Nobody on the main Python ml could answer this so can someone please
explain the background to how memoryviews work in this instance as I'm
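Independently of the rationale, one workaround that needs neither ordering operators nor subclassing is to sort on a key derived from each view's contents. A minimal sketch, assuming byte-wise ordering of the underlying data is what is wanted (the variable names are illustrative only):

```python
# Three views over small buffers, deliberately out of order.
views = [memoryview(bytearray(b)) for b in (b"\x03\x01", b"\x00\x04", b"\x02")]

# Sorting the views directly fails, because memoryview only defines == and !=.
try:
    sorted(views)
    direct_sort_failed = False
except TypeError:
    direct_sort_failed = True

# Sorting on a bytes copy of each view works: bytes objects are orderable.
by_content = sorted(views, key=lambda view: view.tobytes())
print([view.tobytes() for view in by_content])
```

The key function copies each buffer once, so for very large views a cheaper key (for example a slice of the leading bytes) may be preferable.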
I've received three messages in the past hour from mailman at
python.org notifying me of various attempts to receive a password
reminder or to remove me from python-dev. I hope they don't succeed.
--Guido van Rossum (python.org/~guido)
The FAQ has this weird statement:
“This specification does not have an opinion on how you should organize
your code. The .data directory is just a place for any files that are
not normally installed inside site-packages or on the PYTHONPATH.”
But, say, if I want to install some init script to /etc/init.d by using
distutils' data_files argument:
How is it stored and represented by the wheel format?