I find myself occasionally doing this:
... = dirname(dirname(dirname(p)))
I'm always -- literally every time -- looking for a more functional form,
something that would be like this:
# apply dirname() 3 times on its results, initializing with p
... = repapply(dirname, 3, p)
There is a way to hack something like that with reduce, but it's not
pretty--it involves creating a temporary list and a lambda function:
... = reduce(lambda x, y: dirname(x), [p] + [None] * 3)
Just wondering, does anybody know how to do this nicely? Is there an
easy form that allows me to do this?
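A minimal sketch of such a helper, using the repapply name from above:

def repapply(func, n, value):
    # feed func its own result n times, starting from value
    for _ in range(n):
        value = func(value)
    return value

# so repapply(dirname, 3, p) == dirname(dirname(dirname(p)))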
I've had this PEP lying around for quite a few months. It was inspired
by some code we'd written which wanted to be able to get immutable
versions of arbitrary objects. I've finally finished the PEP, uploaded
a sample patch (albeit a bit incomplete), and I'm posting it here to see
if there is any interest.
Reinhold Birkenfeld wrote:
> And we have solved the "map, filter and reduce are going away! Let's
> all weep together" problem with one strike!
I'm not sure if you're wildly enthusiastic, or very sarcastic.
I'm not sure which I should be either ...
The thought does appeal to me - especially func.partial(args). I don't
see any advantage to func.map(args) over func(*args), and it loses
functionality in comparison with map(func, args) (passing the function
as a separate reference).
There's a simple solution to all this - write a competing PEP. One of
the two competing PEPs may be accepted.
FWIW, I'm +1 on PEP 351 in general, and -1 on what you've proposed.
PEP 351 is simple to explain, simple to implement and leaves things
under the control of the developer. I think there are still some issues
to be resolved, but the basic premise is exactly what I would want of a
freeze protocol.
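For readers who haven't looked at it, a rough sketch of the behaviour the
PEP describes -- a freeze() built-in that defers to a __freeze__ hook and
otherwise converts the mutable built-in containers. The ImmutableDict
stand-in is illustrative only, not the PEP's actual type or patch:

class ImmutableDict(dict):
    # illustrative read-only dict; blocks the mutating methods
    def _blocked(self, *args, **kw):
        raise TypeError("ImmutableDict is read-only")
    __setitem__ = __delitem__ = clear = update = _blocked
    setdefault = pop = popitem = _blocked

def freeze(obj):
    # objects get first say via the __freeze__ hook
    if hasattr(obj, '__freeze__'):
        return obj.__freeze__()
    if isinstance(obj, list):
        return tuple([freeze(x) for x in obj])
    if isinstance(obj, set):
        return frozenset([freeze(x) for x in obj])
    if isinstance(obj, dict):
        return ImmutableDict([(k, freeze(v)) for k, v in obj.items()])
    return obj  # assume anything else is already immutable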
At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote:
>* PEP 328 - absolute/relative import
I assume that references to 2.4 in that PEP should be changed to 2.5.
It also appears to me that the PEP doesn't record the issue brought up by
some people about the current absolute/relative ambiguity being useful for
packaging purposes. i.e., being able to nest third-party packages such
that they end up seeing their dependencies, even though they're not
installed at the "root" package level.
For example, I have a package that needs Python 2.4's version of pyexpat,
and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat,
so I just build a backported pyexpat and drop it in the package, so that
the code importing it just ends up with the right thing.
Of course, that specific example is okay since 2.3 isn't going to somehow
grow absolute importing. :) But I think people brought up other examples
besides that; it's just the one that I personally know I've done.
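For concreteness, the layout that trick amounts to -- the package and
module names here are just illustrative:

mypackage/
    __init__.py
    pyexpat.py     # the backported copy, shipped inside the package
    client.py      # "import pyexpat" here finds the nested copy first
                   # under the old relative-then-absolute lookup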
I tried "svn up" to bring my sandbox up-to-date and got this output:
% svn up
svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e'
What's that telling me?
At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote:
>I guess this ought to be recorded. :-(
>The issue has been beaten to death and my position remains firm:
>rather than playing namespace games, consistent renaming is the right
>thing to do here. This becomes a trivial source edit,
Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4
with the same code base.
It'd be nice to have some other advice to offer people besides, "go edit
your code". Of course, if the feature hadn't already existed, I suppose a
PEP to add it would have been shot down, so it's a reasonable decision.
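For what it's worth, "consistent renaming" here means something like
bundling the backport under its own name and editing the imports; the
names below are made up:

# before -- relied on the ambiguous lookup finding the bundled copy:
import pyexpat
# after -- the bundled copy is renamed, and every import says so:
from mypackage import pyexpat_backport as pyexpat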
>which beats the
>problems of debugging things when it doesn't work out as expected
>(which is very common due to the endless subtleties of loading
>multiple versions of the same code).
Yeah, Bob Ippolito and I batted around a few ideas about how to implement
simultaneous multi-version imports for Python Eggs, some of which relied on
the relative/absolute ambiguity, but I think the main subtleties have to do
with dynamic imports (including pickling) and the use of __name__.
Of course, since we never actually implemented it, I don't know what other
subtleties could potentially exist. Python Eggs currently allow you to
install multiple versions of a package, but at runtime you can only import
one of them, and you get a runtime VersionConflict exception if two eggs'
version criteria are incompatible.
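A sketch of how that surfaces through setuptools' pkg_resources API --
the package name here is made up:

import pkg_resources
try:
    # ask for a specific version; raises if an incompatible
    # distribution is already active on sys.path
    pkg_resources.require("SomePackage==1.0")
except pkg_resources.VersionConflict, e:
    print "version conflict:", e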
At 10:34 AM 11/1/2005 -0800, Neal Norwitz wrote:
>Why can't you add your version's directory to sys.path before importing
With library code that can be imported in any order, there is no such thing
as "before". Anyway, Guido has pronounced on this already, so it's moot.
Please bear with me for a few paragraphs ;-)
One aspect of str-type strings is the efficiency afforded when all the encoding really
is ascii. If the internal encoding were e.g. fixed utf-16le for strings, maybe with today's
computers it would still be efficient enough for most actual string purposes (excluding
the current use of str-strings as byte sequences).
I.e., you'd still have to identify what was "strings" (of characters) and what was really
byte sequences with no implied or explicit encoding or character semantics.
Ok, let's make that distinction explicit: Call one kind of string a byte sequence and the
other a character sequence (representation being a separate issue).
A unicode object is of course the prime _general_ representation of a character sequence
in Python, but all the names in Python source code (the ones that become NAME tokens) are UIAM
also character sequences, representable by byte sequences interpreted according to the ascii encoding.
For the sake of discussion, suppose we had another _character_ sequence type that was
the moral equivalent of unicode except for internal representation, namely a str
subclass with an encoding attribute specifying the encoding that you _could_ use
to decode the str bytes part to get unicode (which you wouldn't do except when necessary).
We could call it class charstr(str): ... and have charstr().bytes be the str part and
charstr().encoding specify the encoding part.
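A minimal sketch of that class, just to pin down the interface being
assumed (Python 2 vintage; the details are illustrative, not a design):

class charstr(str):
    # a str subclass that remembers what encoding its bytes are in
    def __new__(cls, bytes, encoding):
        self = str.__new__(cls, bytes)
        self.encoding = encoding
        return self
    def _get_bytes(self):
        return str(self)  # the raw byte sequence, as plain str
    bytes = property(_get_bytes)
    def decode(self, encoding=None):
        # sugar: default to the attached encoding when none is given
        return str.decode(self, encoding or self.encoding)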
In all the contexts where we have obvious encoding information, we can then generate
a charstr instead of a str. E.g., if the source of module_a has
# -*- coding: latin1 -*-
cs = 'über-cool'
type(cs) # => <type 'charstr'>
cs.bytes # => '\xfcber-cool'
cs.encoding # => 'latin-1'
and print cs would act like print cs.bytes.decode(cs.encoding) -- or I guess the same
plus the usual handling for the newline of the print.
Now if module_b has
# -*- coding: utf8 -*-
cs = 'über-cool'
and we interactively
import module_a, module_b
print module_a.cs + ' =?= ' + module_b.cs
what could happen ideally vs. what we have currently?
UIAM, currently we would just get the three str byte sequences
concatenated to make
'\xfcber-cool =?= \xc3\xbcber-cool'
and that would be printed as whatever that comes out as without
conversion, when seen by the output console according to its own encoding.
But if those cs instances had been charstr instances, the coding cookie
encoding information would have been preserved, and the interactive print could
have evaluated the string expression -- given cs.decode() as sugar for
cs.bytes.decode(cs.encoding or globals().get('__encoding__') or sys.getdefaultencoding()) -- as
module_a.cs.decode() + ' =?= '.decode() + module_b.cs.decode()
if pairwise terms differ in encoding, as they all might here. If the interactive
session source were e.g. latin-1, like module_a, then
module_a.cs + ' =?= '
would not require an encoding change, because the ' =?= ' would be a charstr instance
with encoding == 'latin-1', and so the result would still be latin-1 that far.
But with module_b.cs being utf8, the next addition would cause the .decode() promotions
to unicode. In a console window, the ' =?= '.encoding might be 'cp437' or such, and
the first addition would then cause promotion (since module_a.cs.encoding != 'cp437').
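The promotion rule just described, as a sketch -- this assumes the
charstr class from above and that both operands are charstr instances:

def charstr_add(a, b):
    if isinstance(b, charstr) and b.encoding == a.encoding:
        # same encoding on both sides: stay in that encoding
        return charstr(a.bytes + b.bytes, a.encoding)
    # encodings differ: promote both sides to unicode and add there
    return a.decode() + b.decode()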
I have sneaked in run-time access to individual modules' encodings by assuming that
the encoding cookie could be compiled in as an explicit global __encoding__ variable
for any given module (what to have as __encoding__ for built-in modules could vary).
ISTM this could have use in situations where an encoding assumption is necessary and
currently 'ascii' is not as good a guess as one could make, though I suspect if string
literals became charstr strings instead of str strings, many if not most of those situations
would disappear (I'm saying this because ATM I can't think of an 'ascii'-guess situation that
wouldn't go away ;-)). If there were a charchr() version of chr() that would result in
a charstr instead of a str, IWT one would want an easy-sugar default encoding assumption,
probably based on the same as one would assume for '%c' % num in a given module source
-- which presumably would be '%c'.encoding, where '%c' assumes the encoding of the module
source, normally recorded in __encoding__. So charchr(n) would act like
chr(n).decode().encode(''.encoding) -- or more reasonably charstr(chr(n)), which would be
charstr(chr(n), globals().get('__encoding__') or __import__('sys').getdefaultencoding())
Or some efficient equivalent ;-)
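Again as a sketch, reusing the charstr class from above and the
hypothetical __encoding__ module global:

import sys

def charchr(n):
    # the module's source encoding, if __encoding__ were compiled in;
    # else fall back to the interpreter default
    enc = globals().get('__encoding__') or sys.getdefaultencoding()
    return charstr(chr(n), enc)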
Using strings in dicts requires hashing to find key comparison candidates and comparison to
check for key equivalence. This would seem to point to some kind of normalized hashing, but
not necessarily normalized key representation. Some normalization is apparently happening, since
>>> hash('a') == hash(unicode('a'))
True
I don't know what would be worth the trouble to optimize string key usage where strings are
really all of one encoding vs totally general use vs a heavily biased mix. Or even if it could
be done without unreasonable complexity. Maybe a dict could be given an option to hash all
its keys as unicode vs whatever it does now. But having a charstr subtype of str would improve
the "implicit" conversions to unicode IMO.
Anyway, I wanted to throw in my .02USD re the implicit conversions, taking the view that
much of the implicitness could be based on reliable inferences from source encodings of
string literals or from their effects as format strings.
[not a normal subscriber to python-dev, so I'll have to google for any responses]