> Message: 7
> Date: Thu, 24 Feb 2000 10:23:05 -0500 (EST)
> From: Guido van Rossum <guido(a)cnri.reston.va.us>
> To: python-checkins(a)python.org
> Subject: [Python-checkins] CVS: python/dist/src/Objects listobject.c,2.63,2.64
>
> Update of /projects/cvsroot/python/dist/src/Objects
> In directory eric:/projects/python/develop/guido/src/Objects
>
> Modified Files:
> listobject.c
> Log Message:
> Made all list methods use PyArg_ParseTuple(), for more accurate
> diagnostics.
>
> *** INCOMPATIBLE CHANGE: This changes append(), remove(), index(), and
> *** count() to require exactly one argument -- previously, multiple
> *** arguments were silently assumed to be a tuple.
Not sure about remove(), index() and count(), but the change
to .append() will break *lots* of code!
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
I am just coding the translate method for Unicode objects and
have come across a design question that may have some importance
with respect to speed and memory allocation size.
Currently, mapping tables map characters to Unicode characters
and vice-versa. Now the .translate method will use a different
kind of table: mapping integer ordinals to integer ordinals.
Question: which is more efficient: having lots of integers
in a dictionary, or lots of characters?
Another aspect of this question: the translate method
will be able to handle sequences *and* mappings, because it
looks up integers, which can serve as sequence indexes as well
as dictionary keys. The character mapping codec uses characters
as keys and thus only allows dictionaries to be used (the reason
is that in some future version it should be possible to
map single characters to multiple characters, or even combinations
to new combinations).
BTW, I dropped the deletions argument from the translate method:
it is not needed, since a mapping to None will have the same effect.
Note that characters for which no mapping is specified are
copied as-is. This has the nice side-effect of greatly reducing
the mapping table's size.
Note that there will be no .maketrans() method. The same functionality
can easily be coded in Python if needed, and it doesn't fit the
OO-style nature of string and Unicode objects anyway.
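For illustration, the lookup semantics described above can be sketched in
pure Python (a hypothetical helper, not the actual C implementation):

```python
# Hypothetical sketch of the proposed .translate() lookup semantics:
# the table maps integer ordinals to integer ordinals (or None).
def translate(s, table=None):
    result = []
    for ch in s:
        if table is None:
            result.append(ch)       # no table at all: copy everything
            continue
        try:
            # integer lookup works for mappings *and* sequences
            x = table[ord(ch)]
        except (KeyError, IndexError):
            result.append(ch)       # no entry: copy the character as-is
            continue
        if x is None:
            continue                # mapping to None deletes
        result.append(chr(x))
    return ''.join(result)
```

With a dictionary table, translate('abc', {ord('a'): ord('z'),
ord('b'): None}) yields 'zc': 'a' is mapped, 'b' is deleted, and 'c'
(which has no entry) is copied unchanged.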
--
Something else that changed is the way .capitalize() works. The
Unicode version uses the Unicode algorithm for it (see TechRep. 13
on the www.unicode.org site). Here's the new doc string:
S.capitalize() -> unicode
Return a capitalized version of S, i.e. words start with title case
characters, all remaining cased characters have lower case.
Note that *all* characters are touched, not just the first one.
The change was needed to get it in sync with the .iscapitalized()
method, which is based on the Unicode algorithm too.
Should this change be propagated to the string implementation ?
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
[Forwarded to this list with Phil Wadler's permission.]
> Greg Wilson wrote:
> Hi, Phil. I was talking to a couple of colleagues over the weekend,
> one of whom learned Miranda as his second language and never really
> recovered :-). I was wondering: if you could add one or two features
> of Haskell to Python, what would you choose? And why?
Philip Wadler <wadler(a)research.bell-labs.com> wrote:
Well, what I most want is typing. But you already know that.
Next after typing? Full lexical scoping for closures. I want to write:
fun x: fun y: x+y
Not:
fun x: fun y, x=x: x+y
Lexically scoped closures would be a big help for the embedding technique
I described [GVW: in a posting to the Software Carpentry discussion list,
archived at
http://software-carpentry.codesourcery.com/lists/sc-discuss/msg00068.html
which discussed how to build a flexible 'make' alternative in Python].
Next after closures? Disjoint sums. E.g.,
    fun area(shape):
        switch shape:
            case Circle(r):
                return pi*r*r
            case Rectangle(h,w):
                return h*w
(I'm making up a Python-like syntax.) This is an alternative to the OO
approach. With the OO approach, it is hard to add area, unless you modify
the Circle and Rectangle class definitions. On the other hand, with
disjoint sums it is hard to add a new shape, unless you modify all the
existing switch statements for shapes. This is a well-known tradeoff, see
e.g., the discussion in my paper with Odersky on Pizza, or the ECOOP 98
paper by Felleisen and Krishnamurthi.
[GVW: the Pizza paper is available from:
http://cm.bell-labs.com/cm/cs/who/wadler/topics/gj.html
The Felleisen and Krishnamurthi paper is at:
http://www.cs.rice.edu/CS/PLT/Publications/#tr98-299
]
Hi!
In PR#214 Martin v. Loewis suggests a sizeof function as the result
of a request to python-help. I've followed the thread silently until now.
On platforms with a virtual memory subsystem this is usually not an
issue. On embedded systems and ancient OSes (like MS-DOS) it is often
useful if applications can estimate how much memory their data consumes.
The sizeof() function proposed by Martin is only one possible
approach I can think of. Another approach would be encapsulating the
'malloc/free' logic in a wrapper that traces all allocations and
deallocations in a special private 'usedmem' variable, which could
be queried by a function sys.usedmem() returning an integer.
Very often this is more convenient than a sizeof() function, because
you don't need to embed the summing into a possibly complicated nested
object data structure. Although 'usedmem' wouldn't return a precise
measure, it is often sufficient as an estimate, and it should also be
rather easy to implement.
We implemented this approach years ago in a Modula-2
based system, where, however, we had one great advantage: the Modula-2
Storage.DEALLOCATE procedure has a second parameter giving the size
of the data, which is missing from the signature of the C library
free() function. So a wrapper around free() would have to use an
additional hash, or would have to know something about the internals
of the underlying malloc library. The former, of course, would hurt
portability.
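The bookkeeping such a wrapper would need can be sketched in Python
(with hypothetical names; a real implementation would live in C around
malloc/free). The side table keyed by address is exactly the
"additional hash" mentioned above, since free() isn't told the block
size:

```python
# Hypothetical sketch: trace allocations/deallocations so that a
# sys.usedmem()-style query becomes a simple counter read.
class TracingAllocator:
    def __init__(self):
        self._sizes = {}   # the "additional hash": address -> block size
        self._usedmem = 0  # running total of allocated bytes

    def allocate(self, addr, size):
        self._sizes[addr] = size
        self._usedmem += size

    def deallocate(self, addr):
        # free() gets no size argument, so look it up in the hash
        self._usedmem -= self._sizes.pop(addr)

    def usedmem(self):
        return self._usedmem
```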
Regards from Germany, Peter
--
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)
Hi Guido,
When a user does the following with standard Python:
tup = ()
for i in xrange(100000): tup = (tup, i)
del tup # ka-boom
he will get a core dump due to stack limitations.
Recently, I changed Stackless Python to be safe
for any recursive object built from
lists, tuples, dictionaries, tracebacks and frames.
The implementation is not Stackless Python dependent
and very efficient (to my eyes, at least).
For efficiency, locality and minimal changes to five
modules, it is implemented as two embracing macros
which are stuffed around the bodies of the deallocator
methods; that makes just 3-4 lines of change per module.
(Well, the macro *can* be expanded if you like that more)
I can submit patches, but please have a look at the example
below, to save me the time in case you don't like it.
It works great for SLP.
cheers - chris
--------------------------------------
Example of modified list deallocator:
/* Methods */

static void
list_dealloc(op)
        PyListObject *op;
{
        int i;
        Py_TRASHCAN_SAFE_BEGIN(op)
        if (op->ob_item != NULL) {
                /* Do it backwards, for Christian Tismer.
                   There's a simple test case where somehow this reduces
                   thrashing when a *very* large list is created and
                   immediately deleted. */
                i = op->ob_size;
                while (--i >= 0) {
                        Py_XDECREF(op->ob_item[i]);
                }
                free((ANY *)op->ob_item);
        }
        free((ANY *)op);
        Py_TRASHCAN_SAFE_END(op)
}
This is the original 1.5.2+ code, with two macro lines added.
--------------------------------------
Here is the macro code (which may of course be expanded):

#define PyTrash_UNWIND_LEVEL 50

#define Py_TRASHCAN_SAFE_BEGIN(op) \
        { \
                ++_PyTrash_delete_nesting; \
                if (_PyTrash_delete_nesting < PyTrash_UNWIND_LEVEL) {

#define Py_TRASHCAN_SAFE_END(op) \
                ;} \
                else { \
                        if (!_PyTrash_delete_later) \
                                _PyTrash_delete_later = PyList_New(0); \
                        if (_PyTrash_delete_later) \
                                PyList_Append(_PyTrash_delete_later, \
                                              (PyObject *)op); \
                } \
                --_PyTrash_delete_nesting; \
                while (_PyTrash_delete_later \
                       && _PyTrash_delete_nesting <= 0) { \
                        PyObject *shredder = _PyTrash_delete_later; \
                        _PyTrash_delete_later = NULL; \
                        ++_PyTrash_delete_nesting; \
                        Py_DECREF(shredder); \
                        --_PyTrash_delete_nesting; \
                } \
        }
extern DL_IMPORT(int) _PyTrash_delete_nesting;
extern DL_IMPORT(PyObject *) _PyTrash_delete_later;
--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
I didn't have the heart to keep pestering Moshe, so I delayed a bit on the
latest round (and then left town for a few days). As a result, I didn't
get a chance to comment on this last patch.
There is one problem, and some formatting nits...
On Mon, 28 Feb 2000, Guido van Rossum wrote:
>...
> + static int instance_contains(PyInstanceObject *inst, PyObject *member)
> + {
> +     static PyObject *__contains__;
> +     PyObject *func, *arg, *res;
> +     int ret;
> +
> +     if(__contains__ == NULL) {
"Standard Guido Formatting" requires a space between the "if" and the "(".
if, for, while, etc are not functions... they are language constructs.
>...
> +     if(PyObject_Cmp(obj, member, &cmp_res) == -1)
> +         ret = -1;
> +     if(cmp_res == 0)
> +         ret = 1;
> +     Py_DECREF(obj);
> +     if(ret)
> +         return ret;
I had suggested to Moshe to follow the logic in PySequence_Contains(), but
he wanted to use PyObject_Cmp() instead. No biggy, but the above code has
a bug:
PyObject_Cmp() does *not* guarantee a value for cmp_res if it returns -1.
Therefore, it is possible for cmp_res to be zero despite an error being
returned from PyObject_Cmp. Thus, you get a false-positive hit.
IMO, this section of code should read:
    cmp_res = PyObject_Compare(obj, member);
    Py_XDECREF(obj);
    if (cmp_res == 0)
        return 1;
    if (PyErr_Occurred())
        return -1;
The "ret" variable becomes unused and can be deleted. Oh! I just noticed
that "ret" is declared twice, one declaration hiding the other. With the
above change and the deletion of the inner "ret", the hiding
problem is also fixed.
Cheers,
-g
--
Greg Stein, http://www.lyra.org/
On comp.lang.python, "Juergen A. Erhard" <jae(a)ilk.de> wrote about
cursesmodule:
> Why two versions? Did Oliver forget to submit his patches to Guido
> (et al)? Or did Guido not accept them? If so, why not?
>
> What needs to be done to synchronize the canonical Python and the
> Python RPMs?
For python-dev readers: Oliver Andrich's Python RPMs contain his
enhanced cursesmodule, which supports many ncurses features. The
cursesmodule in the Python distribution supports only plain curses.
Question: what should be done about this?
The problem is that Oliver's enhanced module probably won't work on
systems that support only BSD curses. I haven't verified this,
though. On the other hand, ncurses implements the SYSV curses API,
and maybe there are no platforms left that only have plain curses.
Options:
1) Forget about it and leave things as they are.
2) Include the ncurses version of the module, backward compatibility
be damned.
3) Split the curses module out of the standard distribution, and
distribute it separately; users then download the plain or ncurses
version as they see fit.
4) Attempt to make patches for Oliver's module that will make it work
with plain curses.
I don't like #1; if the code is going to be unmaintained in the
future, why leave it in at all? #2 might be OK, if it's the case that
the SYSV curses API is widespread these days; is it? I'd be willing
to take a crack at #4, but have no idea where I could find a system
with only plain curses. (Apparently OpenBSD, at least, includes the
old BSD curses as libocurses.)
--
A.M. Kuchling http://starship.python.net/crew/amk/
When a man tells you that he got rich through hard work, ask him *whose*?
-- Don Marquis
Hi everybody,
As you may have noticed, the latest Unicode snapshot
contains a large number of new codecs. Most of them are
based on a generic mapping codec which makes adding
new codecs a very simple (even automated) task.
I've gotten some feedback on the compatibility of
the JPython Unicode implementation (actually the underlying
Java one) and the new CPython code. Finn Bock mentioned
that Java uses a slightly different naming scheme and
also has some differences in the code-page-to-Unicode
mappings.
* Could someone provide a list of all default code pages
and other encodings that Java supports ? It would be
ideal to provide the same set for CPython, IMHO.
So far I've got these encodings:
cp852.py iso_8859_5.py
cp855.py iso_8859_6.py
ascii.py cp856.py iso_8859_7.py
charmap.py cp857.py iso_8859_8.py
cp037.py cp860.py iso_8859_9.py
cp1006.py cp861.py koi8_r.py
cp1250.py cp862.py latin_1.py
cp1251.py cp863.py mac_cyrillic.py
cp1252.py cp864.py mac_greek.py
cp1253.py cp865.py mac_iceland.py
cp1254.py cp866.py mac_latin2.py
cp1255.py cp869.py mac_roman.py
cp1256.py cp874.py mac_turkish.py
cp1257.py iso_8859_10.py raw_unicode_escape.py
cp1258.py iso_8859_13.py unicode_escape.py
cp424.py iso_8859_14.py unicode_internal.py
cp437.py iso_8859_15.py utf_16.py
cp737.py iso_8859_2.py utf_16_be.py
cp775.py iso_8859_3.py utf_16_le.py
cp850.py iso_8859_4.py utf_8.py
Encoding names map to these module names in the following
way:
1. convert all hyphens to underscores
2. convert all chars to lowercase
3. apply an alias dictionary to the resulting name
Thus u"abc".encode('KOI8-R') and u"abc".encode('koi8_r')
will result in the same codec being used.
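The three normalization steps can be sketched like this (the alias
table entry here is made up for illustration; the real alias dictionary
is part of the codec machinery):

```python
# Hypothetical sketch of the three normalization steps above.
ALIASES = {'us_ascii': 'ascii'}  # illustrative entry only

def normalize_encoding_name(name):
    name = name.replace('-', '_')   # 1. hyphens to underscores
    name = name.lower()             # 2. lowercase
    return ALIASES.get(name, name)  # 3. apply the alias dictionary
```

So both 'KOI8-R' and 'koi8_r' normalize to the module name 'koi8_r'.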
* There's also another issue: code pages with names cpXXXX
come from two sources: IBM and MS. Unfortunately, some of
these pages don't match even though they carry the same name.
Could someone verify whether the included maps work on
Windows, DOS and Mac platforms as intended ? (Finn reported
some divergence between the Java view of things and the
maps I created from the ones on the ftp.unicode.org site.)
Thanks,
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
> This patch is for Python 1.5.2, /Parser/myreadline.c, on QNX using
> Watcom C++.
> Readline does not work properly.
> It uses the QNX input_line function instead of readline.
> ------------------------------------------------------------
> 30,38d29
> <
> < #ifdef __QNX__
> < p = input_line( fp, buf, len );
> < if( p ) {
> < int n = strlen(p);
> < p[n] = '\n';
> < p[n+1] = 0;
> < }
> < #else
> 40d30
> < #endif
> -------------------------------------------------------------
I seem to recall that this came up recently but I don't remember
where. Can anybody jog my memory? What did we decide in the end?
--Guido van Rossum (home page: http://www.python.org/~guido/)