Mailman 3 September 2001 - Python-Dev

Preventing PyEval_AcquireLock deadlock
by Robin Dunn Sept. 14, 2001

Is there an easy way in the API to check if the current thread already has the interpreter lock so I can avoid calling PyEval_AcquireLock again? If so, is it available all the way back to 1.5.2?

--
Robin Dunn
Software Craftsman
robin(a)AllDunn.com       Java give you jitters?
http://wxPython.org      Relax with wxPython!

A draft PEP for a new memory model
by Paul Barrett Sept. 14, 2001

The following is the beginning of a PEP for a new memory model for Python. It currently contains only the motivation section and a description of a preliminary design. I'm submitting the PEP in its current form to get a feel for whether or not I should pursue this proposal and to find out if I am overlooking any details that would make it incompatible with Python's core implementation, i.e. whether implementing it would have too much of an effect on Python's performance.

I do plan to implement something along these lines, but may have to change my approach if I hear comments about this PEP to the contrary.

Cheers,
Paul

PEP: XXX
Title: A New Memory Management Model for Python
Version: $Revision: 1.3 $
Last-Modified: $Date: 2001/08/20 23:59:26 $
Author: barrett(a)stsci.edu (Paul Barrett)
Status: Draft
Type: Standards Track
Created: 05-Sep-2001
Python-Version: 2.3
Post-History:
Replaces: PEP 42

Abstract

This PEP proposes a new memory management model to provide better support for the various types of memory found in modern operating systems. The proposed model separates the memory object from its access method. In simplest terms, memory objects only allocate memory, while access objects only provide access to that memory. This separation allows various types of memory to share a common interface or access object and vice versa.

Motivation

There are three sequence objects which share similar interfaces, but have different intended uses. The first is the indispensable 'string' object. A 'string' is an immutable sequence of characters and supports slicing, indexing, concatenation, replication, and related string-type operations. The second is the 'array' object. Like a 'list', it is a mutable sequence and supports slicing, indexing, concatenation, and replication, but its values are constrained to one of several basic types, namely characters, integers, and floating point numbers. This constraint enables efficient storage of the values. The third object is the 'buffer', which behaves similarly to a string object at the Python programming level: it supports slicing, indexing, concatenation, and related string-like operations. However, its data can come from either a block of memory or an object that exports the buffer interface, such as 'mmap', the memory-mapped file object which is its prime justification.

Each object has been used at one time or other as a way of allocating read-write memory from the heap. The 'string' object is often used at the C programming level because it is a standard Python object, but its use goes counter to its intended behavior of being immutable. The preferred way of allocating such memory is the 'array' object, but its insistence on returning a representation of itself for both the 'repr' and 'str' methods makes it cumbersome to use. In addition, the use of a 'string' as an initializer during 'array' creation is inefficient, because the memory is temporarily allocated twice, once for the 'string' and once for the 'array'. This is particularly onerous when allocating tens of megabytes of memory.

The 'buffer' object also has its problems, some of which have been discussed on python-dev. Some of the more important ones are:

(1) The 'buffer' object always returns a read-only 'buffer', even for read-write objects. This is apparently a bug in the 'buffer' object, which is fixable.

(2) The buffer API provides no guarantee about the lifetime of the base pointer - even if the 'buffer' object holds a reference to the base object, since there is no locking mechanism associated with the base pointer. For example, if the initial 'buffer' is deleted, the memory pointer of the derived 'buffer' will refer to freed memory. This situation happens most often at the C programming level, as in the following situation:

    PyObject *base = PyBuffer_New(100);
    PyObject *buffer = PyBuffer_FromObject(base);
    Py_DECREF(base);

This problem is also fixable.

(3) The 'buffer' object cannot easily be used to allocate read-write memory at the Python programming level. The obvious approach is to use a 'string' as the base object of the 'buffer'. Yet a 'string' is immutable, which means the 'buffer' object derived from it is also immutable, even if problem (1) is fixed. The only alternative at the Python programming level is to use the cumbersome 'array' object or to create your own version of the 'buffer' object to allocate a block of memory.

We feel that the solution to these and other problems is best illustrated by problem (3), which can essentially be described as the simple operation of allocating a block of read-write memory from the heap. Python currently provides no standard way of doing this. It is instead done by subterfuge at the C programming level using the 'string', 'array', or 'buffer' APIs. A solution to this specific problem is to include a 'malloc' object as part of standard Python. This object will be used to allocate a block of memory from the heap, and the 'buffer' object will be used to access this memory just as it is used to access data from a memory-mapped file. Yet this hints at a more general solution: the creation of two classes of objects, one for memory allocation and one for memory access.

The Model

We propose a new memory-management model for Python which separates the allocation object from its access method. This mix-and-match memory model will enable various access objects, such as 'array', 'string', and 'file', to easily access the data from different types of memory, namely heap, shared, and memory-mapped files; or in other words, different types of memory can share a common interface (see figure below). It will also provide better support for the various types of memory found in modern operating systems.

    |---------------------------------------------------|
    |                  interface layer                  |
    | ------------------------------------------------- |
    |    array    |    string    |    file    |   ...   |
    |===================================================|
    |                    data layer                     |
    | ------------------------------------------------- |
    | heap memory | shared memory | memory mapped file  |
    |---------------------------------------------------|

Memory Objects

Modern operating systems, such as Unix and Windows, provide access to several different types of memory, namely heap, shared, and memory-mapped files. These memory types share two common attributes: a pointer to the memory and the size of the memory. This information is usually sufficient for objects whose data uses heap memory, since the object is expected to have sole control over that memory throughout the lifetime of the object. For objects whose data also uses shared and memory-mapped files, an additional attribute is necessary for access permission. However, the issue of how to handle memory persistence across processes does not appear well-defined in modern OSs, but appears to be left to the programmer to implement. In any case, a fourth attribute to handle memory persistence seems imperative.

Access Objects

Consider 'array', 'buffer', and 'string' objects. Each provides, more or less, the same string-like interface to its underlying data. They each support slicing, indexing, concatenation, and replication of the data. They differ primarily in the types of initializing data and the permissions associated with the underlying data. Currently, the 'array' initializer accepts only 'list' and 'string' objects. If this were extended to include objects that support the 'buffer interface', then the distinction between the 'array' and 'buffer' objects would disappear, since they both support the sequence interface and the same set of base objects. The 'buffer' object is therefore redundant and no longer necessary.

The 'string' and 'array' objects would still be distinct, since the 'array' object encompasses more data-types than does the 'string' object. The 'array' object is also mutable, requiring its underlying data to be read-write, while the 'string' object is immutable, requiring read-only data. This new memory-management model therefore suggests that the 'string' object support the 'buffer interface' with the proviso that the data have read-only permission.

Implementation

References

Copyright

This document has been placed in the public domain.

--
Paul Barrett, PhD      Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218
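
The split the draft describes, one object that owns a block of memory and another that provides access to it, can be sketched with facilities found in later Python versions. This is only an illustration, assuming Python 3's bytearray, mmap, and memoryview; it is not the API the draft proposes, and the helper name fill is hypothetical.

```python
import mmap


def fill(view, value):
    """Write `value` into every byte of a writable buffer.

    `view` is the access object; it does not care which kind of
    memory sits underneath it. (Hypothetical helper for illustration.)
    """
    for i in range(len(view)):
        view[i] = value


# Two different "memory objects": heap memory and an anonymous mapped region.
heap_block = bytearray(16)
mapped_block = mmap.mmap(-1, 16)

# One common "access object" (memoryview) works on both.
for block in (heap_block, mapped_block):
    with memoryview(block) as view:
        fill(view, ord("*"))

heap_bytes = bytes(heap_block)
mapped_bytes = mapped_block[:16]

# An immutable object can export its data too, but only read-only,
# which mirrors the proviso in the Access Objects section.
string_is_read_only = memoryview(b"immutable").readonly

mapped_block.close()
```

The view is released (by the with statement) before the mapping is closed, which is how this sketch sidesteps the lifetime problem described in point (2) of the Motivation.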

Re: PEP 269
by Samuele Pedroni Sept. 13, 2001

Hi,

personally I have the following concerns about PEP 269:

- if its purpose is to offer a framework for small-language support, there are already modules around that support that (SPARK, PLY ...); the only advantage of PEP 269 is speed with respect to the pure Python solutions, because of the use of the internal CPython parser. OTOH the other solutions are more flexible...

- or, if its purpose is to help experimenting with the grammar, then unless support for adding keywords is added it is a quite unfinished tool.

Further, the PEP proposes to use the actual AST format of the parser module as output format. To be honest, that format is quite awful, especially for general-purpose use.

It should be considered that Jython does not contain a parser similar to the CPython one. Because of this, Jython does not offer parser module support. So implementing the PEP for Jython would require writing a Java or pure Python equivalent of the CPython parser.

My plan for resolving the lack of parser module support was to implement a higher-level compatibility layer based on the AST format of tools/compiler, a nicer format.

PEP 269 adds issues to this open problem, which I would like to see addressed by future revisions and by further discussions.

I can live with PEP 269 implemented only for CPython, for lack of resources on the Jython side, if it is to be used for rare experimenting with the grammar. But it seems, as it is, a rather half-cooked solution to offer a module for mini-language support in the standard library.

regards, Samuele Pedroni.

> From: Jonathan Riehl <jriehl(a)spaceship.com>
> To: Martin von Loewis <loewis(a)informatik.hu-berlin.de>
> cc: <python-list(a)python.org>, <types-sig(a)python.org>
> MIME-Version: 1.0
> Subject: [Types-sig] Re: PEP 269
> X-BeenThere: types-sig(a)python.org
> X-Mailman-Version: 2.0.6 (101270)
> List-Help: <mailto:types-sig-request@python.org?subject=help>
> List-Post: <mailto:types-sig@python.org>
> List-Subscribe: <http://mail.python.org/mailman/listinfo/types-sig>, <mailto:types-sig-request@python.org?subject=subscribe>
> List-Id: Special Interest Group on the Python type system <types-sig.python.org>
> List-Unsubscribe: <http://mail.python.org/mailman/listinfo/types-sig>, <mailto:types-sig-request@python.org?subject=unsubscribe>
> List-Archive: <http://mail.python.org/pipermail/types-sig/>
> Date: Thu, 13 Sep 2001 14:49:32 -0500 (CDT)
>
> Howdy all,
> I'm afraid Martin's attention to the PEP list has outed me before I was able to post about this myself. Anyway, for those interested, I wrote a PEP for the exposure of pgen to the Python interpreter. You may view it at:
>
> http://python.sourceforge.net/peps/pep-0269.html
>
> I am looking for comments on this PEP, and below, I address some interesting issues raised by Martin. Furthermore, I already have a partially functioning reference implementation, and should be pestered to make it available shortly.
>
> Thanks,
> -Jon
>
> On Tue, 11 Sep 2001, Martin von Loewis wrote:
>
> > Hi Jonathan,
> >
> > With interest I noticed your proposal to include Pgen into the standard library. I'm not sure about the scope of the proposed change: Do you view pgen as a candidate for a general-purpose parser toolkit, or do you "just" contemplate using that for variations of the Python grammar?
>
> I am thinking of going for the low hanging fruit first (a Python centric pgen module), and then adding more functionality for later releases of Python (see below.)
>
> > If the former, I think there should be a strategy already how to expose pgen to the application; the proposed API seems inappropriate. In particular:
> >
> > - how would I integrate an alternative tokenizer?
> > - how could I integrate semantic actions into the parse process, instead of creating the canonical AST?
>
> The current change proposed is somewhat restrained by the Python 2.2 release schedule, and will initially only address building parsers that use the Python tokenizer. If the module misses the 2.2 release, I'd like to make it more functional and provide the ability to override the Python tokenizer. I may also add methods to export all the data found in the DFA structure.
>
> I am unsure what integrating semantics into the parse process buys us besides lower memory overhead. In C/C++ such coupling is needed because of the TYPEDEF/IDENTIFIER tokenization problem, but I don't see Python and future Python-like, LL(1), languages needing such hacks. Finally, I am prone to enforce the separation of the backend actions from the AST. This allows the AST to be used for a variety of purposes, rather than those intended by the initial parser developer.
>
> > Of course, these questions are less interesting if the scope is to parse Python: in that case, Python tokenization is fine, and everybody is used to getting the Python AST.
>
> An interesting note to make about this is that since the nonterminal integer values are generated by pgen, pgen AST's are not currently compatible with the parser module AST's. Perhaps such unification may be slated for future work (I know Fred left room in the parser AST datatype for identification of the grammar that generated the AST using an integer value, but using this would be questionable in a "rapid parser development" environment.)
>
> > On the specific API, I think you should drop the File functions (parseGrammarFile, parseFile). Perhaps you can also drop the String functions, and provide only functions that expect file-like objects.
>
> I am open to further discussion on this, but I would note that filename information is used (and useful) when reporting syntax errors. I think that the "streaming" approach to parsing is another holdover from days where memory constraints ruled (much like binding semantics to the parser itself.)
>
> > On the naming of the API functions: I propose to use an underscore style instead of the mixedCaps style, or perhaps to leave out any structure (parsegrammar, buildparser, parse, symbol2string, string2symbolmap). That would be more in line with the parser module.
>
> I would like to hear more about this from the Pythonati. I am currently following the naming conventions I use at work, which of course is most natural for me at home. :)
>
> > Regards,
> > Martin
>
> _______________________________________________
> Types-SIG mailing list
> Types-SIG(a)python.org
> http://mail.python.org/mailman/listinfo/types-sig
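
Jonathan's argument for keeping backend actions separate from the AST can be illustrated with a small sketch. It assumes the ast module from current Python 3, which is neither the parser-module format nor the tools/compiler format discussed in the thread; the function names evaluate and to_prefix and the table _OPS are hypothetical. One tree is built once and then consumed by two unrelated backends.

```python
import ast
import operator

# Hypothetical table: supported operator node types, their printable
# symbol, and the function that implements them.
_OPS = {
    ast.Add: ("+", operator.add),
    ast.Sub: ("-", operator.sub),
    ast.Mult: ("*", operator.mul),
}


def evaluate(node):
    """Backend 1: compute the value of a simple arithmetic expression."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        func = _OPS[type(node.op)][1]
        return func(evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported node: %r" % (node,))


def to_prefix(node):
    """Backend 2: render the same tree in prefix notation."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.Constant):
        return repr(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        symbol = _OPS[type(node.op)][0]
        return "(%s %s %s)" % (symbol, to_prefix(node.left), to_prefix(node.right))
    raise ValueError("unsupported node: %r" % (node,))


# Parse once; neither backend was known to the parser.
tree = ast.parse("1 + 2 * 3", mode="eval")
value = evaluate(tree)
rendered = to_prefix(tree)
```

Had the evaluation been wired into the parse as semantic actions, the prefix renderer would need its own parse of the same input.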

Free threading and borrowing references from mutable types
by Martin von Loewis Sept. 12, 2001

Considering the free threading issue (again), I found that functions returning borrowed references are problematic if the container is mutable. In traditional Python, extension modules could safely borrow references if they know that they maintain a reference to the container. If a thread switch is possible between getting the borrowed reference and using it, then this assumption is wrong: another thread may remove the reference from the container, so that the object dies. Therefore, I …

interning string subclasses
by Guido van Rossum Sept. 11, 2001

> Question: Should we complain if someone tries to intern an instance of
> a string subclass? I hate to slow any code on those paths.

I think in this case intern(s) should return intern(str(s)). The fast path checks ob_sinterned first, and that should always point to a real string for a string subclass.

--Guido van Rossum (home page: http://www.python.org/~guido/)
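
The behaviour suggested here, interning the plain-string value of a subclass instance, can be sketched as follows. This assumes Python 3, where the builtin lives at sys.intern; the class Tag and the helper intern_any are hypothetical names. Current CPython takes the other route and rejects str subclasses in sys.intern with a TypeError, which the last lines record.

```python
import sys


class Tag(str):
    """A trivial str subclass, used only for illustration."""


def intern_any(s):
    """Intern the plain-str value of `s`, i.e. intern(str(s))."""
    return sys.intern(str(s))


a = intern_any(Tag("spam-and-eggs"))
# Build an equal string at run time so it is a distinct object before interning.
b = intern_any("".join(["spam", "-and-", "eggs"]))

same_object = a is b          # both names refer to the one interned string
plain_type = type(a) is str   # the subclass is not preserved

# Interning the subclass instance directly is refused by current CPython.
try:
    sys.intern(Tag("spam-and-eggs"))
    direct_intern_rejected = False
except TypeError:
    direct_intern_rejected = True
```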

[development doc updates]
by Fred L. Drake Sept. 11, 2001

The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Miscellaneous updates, plus documentation for the new "hmac" module (located in the crypto chapter of the Library Reference).

Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1
by Greg Stein Sept. 11, 2001

I've now created nondist/sandbox/Lib as a place where people can (cooperatively) develop modules intended for inclusion into the core's Lib directory. Of course, at your discretion, you can also create sandbox/big-project, but the sandbox/Lib directory could be handy for more people. I've checked in a non-working httpx, and the current davlib. These will get worked on over the next few weeks to prep them for the next release. Review and commentary are welcome! Cheers, -g On Mon, Sep 10, 2001 …

2.2a3 error messages
by Fredrik Lundh Sept. 11, 2001

maybe it's just me, but I just spent five minutes trying to figure out why an innocent-looking line of code resulted in an "iter() of non-sequence" type error. I finally ran it under 2.1, and immediately realized what was wrong. is there any chance of getting the old, far more helpful "unpack non-sequence" and "loop over non-sequence" error messages back before 2.2 final? </F>
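
The two situations named above, unpacking a non-sequence and looping over one, are easy to reproduce. The sketch below assumes Python 3; the helper describe and the two case functions are hypothetical names. It prints whatever message the running interpreter produces, since the exact wording differs between Python versions.

```python
def describe(action):
    """Run `action` and report the TypeError it raises, if any."""
    try:
        action()
    except TypeError as exc:
        return "%s: %s" % (type(exc).__name__, exc)
    return "no error"


def unpack_non_sequence():
    a, b = 42  # an int cannot be unpacked


def loop_over_non_sequence():
    for item in 42:  # an int cannot be iterated
        pass


for case in (unpack_non_sequence, loop_over_non_sequence):
    print(case.__name__, "->", describe(case))
```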

problem with inspect module and Jython
by James_Althoff＠i2.com Sept. 11, 2001

Apologies for not being up to speed on the standard bug reporting process.

There appears to be an incompatibility between the inspect module and Jython. The inspect module uses "type(xxx) is types.zzz" in a number of places. This seems to fail when inspect is used with Jython. Using "isinstance" instead works as shown in the example below. My understanding is that "isinstance" is the preferred idiom in any case.

Jim

===========================================

from the inspect module:

def iscode(object):
    """Return true if the object is a code object.

    Code objects provide these attributes:
        co_argcount     number of arguments (not including * or ** args)
        co_code         string of raw compiled bytecode
        co_consts       tuple of constants used in the bytecode
        co_filename     name of file in which this code object was created
        co_firstlineno  number of first line in Python source code
        co_flags        bitmap: 1=optimized | 2=newlocals | 4=*arg | 8=**arg
        co_lnotab       encoded mapping of line numbers to bytecode indices
        co_name         name with which this code object was defined
        co_names        tuple of names of local variables
        co_nlocals      number of local variables
        co_stacksize    virtual machine stack space required
        co_varnames     tuple of names of arguments and local variables"""
    ###return type(object) is types.CodeType    # <<< returns 0 (before reload below)
    return isinstance(object,types.CodeType)    # <<< returns 1 (after reload below)

Jython 2.1b1 on java1.3.0 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> from core.probe import tablepanel
>>> import inspect
>>> source = inspect.getsource(tablepanel.TablePanel.__init__)
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "C:\_Dev\pnp\3rdparty\jython\Lib\inspect.py", line 411, in getsource
  File "C:\_Dev\pnp\3rdparty\jython\Lib\inspect.py", line 400, in getsourcelines
  File "C:\_Dev\pnp\3rdparty\jython\Lib\inspect.py", line 280, in findsource
IOError: could not get source code
>>> reload(inspect)
<module inspect at 4523599>
>>> source = inspect.getsource(tablepanel.TablePanel.__init__)
>>> source
" def __init__(self,rowList=None,label=None):\n self.rowList = rowList or [['','','']]\n self.jtable = None\n from javax.swing.table import DefaultTableModel\n self.tableModel = DefaultTableModel(self.rowList,self.columnNameList)\n _super.__init__(self,label=label)\n"
>>>
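
The general difference between the two idioms can be shown in a few lines. This sketch assumes Python 3 on CPython and does not reproduce the Jython failure reported above; Base, Derived, and the two helper functions are hypothetical names. An exact type comparison rejects any object whose concrete type differs from the expected one, while isinstance also accepts subclasses.

```python
import types


class Base:
    pass


class Derived(Base):
    pass


def is_base_exact(obj):
    """Exact-type check, in the style of 'type(xxx) is types.zzz'."""
    return type(obj) is Base


def is_base(obj):
    """Subclass-aware check, in the style of isinstance."""
    return isinstance(obj, Base)


obj = Derived()
exact_result = is_base_exact(obj)    # False: type(obj) is Derived, not Base
instance_result = is_base(obj)       # True: Derived is a subclass of Base

# The check from the patched iscode(), applied to a real code object.
code_obj = compile("1 + 1", "<example>", "eval")
code_check = isinstance(code_obj, types.CodeType)
```

An implementation whose objects are instances of a related but not identical type would fail the first check and pass the second, which is one plausible reading of the report, though the message itself does not say why the identity test fails under Jython.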

More test problems
by Jack Jansen Sept. 10, 2001

The recent mods to the test suite make my life a _lot_ simpler, thanks!

I now have a new problem, one that I've seen in the past but always seems to go away all by itself. Urllib2 will fail when I run the whole regrtest suite:

>>> import test.regrtest
>>> test.regrtest.main()
test_grammar
[... many lines deleted]
test_urllib2
test test_urllib2 crashed -- exceptions.AttributeError: 'module' object has no attribute 'error'

But if I run only the urllib2 test in verbose mode it …
