I find myself occasionally doing this:
... = dirname(dirname(dirname(p)))
I'm always -- literally every time -- looking for a more functional form,
something that would be like this:
# apply dirname() 3 times on its results, initializing with p
... = repapply(dirname, 3, p)
There is a way to hack something like that with reduce, but it's not
pretty--it involves creating a temporary list and a lambda function:
... = reduce(lambda x, y: dirname(x), [p] + [None] * 3)
Just wondering, does anybody know how to do this nicely? Is there an
easy form that allows me to do this?
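A minimal sketch of such a helper, using the repapply name from above:

def repapply(func, n, value):
    # feed func its own result n times, starting from value
    for _ in range(n):
        value = func(value)
    return value

# so repapply(dirname, 3, p) == dirname(dirname(dirname(p)))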
I've had this PEP lying around for quite a few months. It was inspired
by some code we'd written which wanted to be able to get immutable
versions of arbitrary objects. I've finally finished the PEP, uploaded
a sample patch (albeit a bit incomplete), and I'm posting it here to see
if there is any interest.
Reinhold Birkenfeld wrote:
> And we have solved the "map, filter and reduce are going away! Let's
> all weep together" problem with one strike!
I'm not sure if you're wildly enthusiastic, or very sarcastic.
I'm not sure which I should be either ...
The thought does appeal to me - especially func.partial(args). I don't
see any advantage to func.map(args) over func(*args), and it loses
functionality in comparison with map(func, args) (passing the function
as a separate reference).
There's a simple solution to all this - write a competing PEP. One of
the two competing PEPs may be accepted.
FWIW, I'm +1 on PEP 351 in general, and -1 on what you've proposed.
PEP 351 is simple to explain, simple to implement and leaves things
under the control of the developer. I think there are still some issues
to be resolved, but the basic premise is exactly what I would want of a
freeze protocol.
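For readers who haven't looked at it, a rough sketch of the behaviour the
PEP describes -- a freeze() built-in that defers to a __freeze__ hook and
otherwise converts the mutable built-in containers. The ImmutableDict
stand-in is illustrative only, not the PEP's actual type or patch:

class ImmutableDict(dict):
    # illustrative read-only dict; blocks the mutating methods
    def _blocked(self, *args, **kw):
        raise TypeError("ImmutableDict is read-only")
    __setitem__ = __delitem__ = clear = update = _blocked
    setdefault = pop = popitem = _blocked

def freeze(obj):
    # objects get first say via the __freeze__ hook
    if hasattr(obj, '__freeze__'):
        return obj.__freeze__()
    if isinstance(obj, list):
        return tuple([freeze(x) for x in obj])
    if isinstance(obj, set):
        return frozenset([freeze(x) for x in obj])
    if isinstance(obj, dict):
        return ImmutableDict([(k, freeze(v)) for k, v in obj.items()])
    return obj  # assume anything else is already immutable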
At 10:22 AM 11/1/2005 -0700, Guido van Rossum wrote:
>* PEP 328 - absolute/relative import
I assume that references to 2.4 in that PEP should be changed to 2.5.
It also appears to me that the PEP doesn't record the issue brought up by
some people about the current absolute/relative ambiguity being useful for
packaging purposes. i.e., being able to nest third-party packages such
that they end up seeing their dependencies, even though they're not
installed at the "root" package level.
For example, I have a package that needs Python 2.4's version of pyexpat,
and I need it to run in 2.3, but I can't really overwrite the 2.3 pyexpat,
so I just build a backported pyexpat and drop it in the package, so that
the code importing it just ends up with the right thing.
Of course, that specific example is okay since 2.3 isn't going to somehow
grow absolute importing. :) But I think people brought up other examples
besides that; it's just the one that I personally know I've done.
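For concreteness, the layout that trick amounts to -- the package and
module names here are just illustrative:

mypackage/
    __init__.py
    pyexpat.py     # the backported copy, shipped inside the package
    client.py      # "import pyexpat" here finds the nested copy first
                   # under the old relative-then-absolute lookup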
I tried "svn up" to bring my sandbox up-to-date and got this output:
% svn up
svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e'
What's that telling me?
At 11:14 AM 11/1/2005 -0700, Guido van Rossum wrote:
>I guess this ought to be recorded. :-(
>The issue has been beaten to death and my position remains firm:
>rather than playing namespace games, consistent renaming is the right
>thing to do here. This becomes a trivial source edit,
Well, it's not trivial if you're (in my case) trying to support 2.3 and 2.4
with the same code base.
It'd be nice to have some other advice to offer people besides, "go edit
your code". Of course, if the feature hadn't already existed, I suppose a
PEP to add it would have been shot down, so it's a reasonable decision.
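For what it's worth, "consistent renaming" here means something like
bundling the backport under its own name and editing the imports; the
names below are made up:

# before -- relied on the ambiguous lookup finding the bundled copy:
import pyexpat
# after -- the bundled copy is renamed, and every import says so:
from mypackage import pyexpat_backport as pyexpat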
>which beats the
>problems of debugging things when it doesn't work out as expected
>(which is very common due to the endless subtleties of loading
>multiple versions of the same code).
Yeah, Bob Ippolito and I batted around a few ideas about how to implement
simultaneous multi-version imports for Python Eggs, some of which relied on
the relative/absolute ambiguity, but I think the main subtleties have to do
with dynamic imports (including pickling) and the use of __name__.
Of course, since we never actually implemented it, I don't know what other
subtleties could potentially exist. Python Eggs currently allow you to
install multiple versions of a package, but at runtime you can only import
one of them, and you get a runtime VersionConflict exception if two eggs'
version criteria are incompatible.
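A sketch of how that surfaces through setuptools' pkg_resources API --
the package name here is made up:

import pkg_resources
try:
    # ask for a specific version; raises if an incompatible
    # distribution is already active on sys.path
    pkg_resources.require("SomePackage==1.0")
except pkg_resources.VersionConflict, e:
    print "version conflict:", e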
At 10:34 AM 11/1/2005 -0800, Neal Norwitz wrote:
>Why can't you add your version's directory to sys.path before importing
With library code that can be imported in any order, there is no such thing
as "before". Anyway, Guido has pronounced on this already, so it's moot.
Please bear with me for a few paragraphs ;-)
One aspect of str-type strings is the efficiency afforded when all the encoding really
is ascii. If the internal encoding were e.g. fixed utf-16le for strings, maybe with today's
computers it would still be efficient enough for most actual string purposes (excluding
the current use of str-strings as byte sequences).
I.e., you'd still have to identify what was "strings" (of characters) and what was really
byte sequences with no implied or explicit encoding or character semantics.
Ok, let's make that distinction explicit: Call one kind of string a byte sequence and the
other a character sequence (representation being a separate issue).
A unicode object is of course the prime _general_ representation of a character sequence
in Python, but all the names in Python source code (the ones that become NAME tokens) are UIAM
also character sequences, representable by byte sequences interpreted according to the ascii encoding.
For the sake of discussion, suppose we had another _character_ sequence type that was
the moral equivalent of unicode except for internal representation, namely a str
subclass with an encoding attribute specifying the encoding that you _could_ use
to decode the str bytes part to get unicode (which you wouldn't do except when necessary).
We could call it class charstr(str): ... and have charstr().bytes be the str part and
charstr().encoding specify the encoding part.
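A minimal sketch of that class, just to pin down the interface being
assumed (Python 2 vintage; the details are illustrative, not a design):

class charstr(str):
    # a str subclass that remembers what encoding its bytes are in
    def __new__(cls, bytes, encoding):
        self = str.__new__(cls, bytes)
        self.encoding = encoding
        return self
    def _get_bytes(self):
        return str(self)  # the raw byte sequence, as plain str
    bytes = property(_get_bytes)
    def decode(self, encoding=None):
        # sugar: default to the attached encoding when none is given
        return str.decode(self, encoding or self.encoding)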
In all the contexts where we have obvious encoding information, we can then generate
a charstr instead of a str. E.g., if the source of module_a has
# -*- coding: latin1 -*-
cs = 'über-cool'
type(cs) # => <type 'charstr'>
cs.bytes # => '\xfcber-cool'
cs.encoding # => 'latin-1'
and print cs would act like print cs.bytes.decode(cs.encoding) -- or I guess the same
plus the usual handling for the newline of the print.
Now if module_b has
# -*- coding: utf8 -*-
cs = 'über-cool'
and we interactively
import module_a, module_b
print module_a.cs + ' =?= ' + module_b.cs
what could happen ideally vs. what we have currently?
UIAM, currently we would just get the three str byte sequences
concatenated to make
'\xfcber-cool =?= \xc3\xbcber-cool'
and that would be printed as whatever that comes out as without
conversion, when seen by the output console according to its own encoding.
But if those cs instances had been charstr instances, the coding cookie
encoding information would have been preserved, and the interactive print could
have evaluated the string expression -- given cs.decode() as sugar for
cs.bytes.decode(cs.encoding or globals().get('__encoding__') or sys.getdefaultencoding()) -- as
module_a.cs.decode() + ' =?= '.decode() + module_b.cs.decode()
if pairwise terms differ in encoding, as they all might here. If the interactive
session source were e.g. latin-1, like module_a, then
module_a.cs + ' =?= '
would not require an encoding change, because the ' =?= ' would be a charstr instance
with encoding == 'latin-1', and so the result would still be latin-1 that far.
But with module_b.cs being utf8, the next addition would cause the .decode() promotions
to unicode. In a console window, the ' =?= '.encoding might be 'cp437' or such, and
the first addition would then cause promotion (since module_a.cs.encoding != 'cp437').
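The promotion rule just described, as a sketch -- this assumes the
charstr class from above and that both operands are charstr instances:

def charstr_add(a, b):
    if isinstance(b, charstr) and b.encoding == a.encoding:
        # same encoding on both sides: stay in that encoding
        return charstr(a.bytes + b.bytes, a.encoding)
    # encodings differ: promote both sides to unicode and add there
    return a.decode() + b.decode()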
I have sneaked in run-time access to individual modules' encodings by assuming that
the encoding cookie could be compiled in as an explicit global __encoding__ variable
for any given module (what to have as __encoding__ for built-in modules could vary).
ISTM this could have use in situations where an encoding assumption is necessary and
currently 'ascii' is not as good a guess as one could make, though I suspect if string
literals became charstr strings instead of str strings, many if not most of those situations
would disappear (I'm saying this because ATM I can't think of an 'ascii'-guess situation that
wouldn't go away ;-)). If there were a charchr() version of chr() that would result in
a charstr instead of a str, IWT one would want an easy-sugar default encoding assumption,
probably based on the same as one would assume for '%c' % num in a given module source
-- which presumably would be '%c'.encoding, where '%c' assumes the encoding of the module
source, normally recorded in __encoding__. So charchr(n) would act like
chr(n).decode().encode(''.encoding) -- or more reasonably charstr(chr(n)), which would be
charstr(chr(n), globals().get('__encoding__') or __import__('sys').getdefaultencoding())
Or some efficient equivalent ;-)
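Again as a sketch, reusing the charstr class from above and the
hypothetical __encoding__ module global:

import sys

def charchr(n):
    # the module's source encoding, if __encoding__ were compiled in;
    # else fall back to the interpreter default
    enc = globals().get('__encoding__') or sys.getdefaultencoding()
    return charstr(chr(n), enc)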
Using strings in dicts requires hashing to find key comparison candidates and comparison to
check for key equivalence. This would seem to point to some kind of normalized hashing, but
not necessarily normalized key representation. Some normalization is apparently happening, since
>>> hash('a') == hash(unicode('a'))
True
I don't know what would be worth the trouble to optimize string key usage where strings are
really all of one encoding vs totally general use vs a heavily biased mix. Or even if it could
be done without unreasonable complexity. Maybe a dict could be given an option to hash all
its keys as unicode vs whatever it does now. But having a charstr subtype of str would improve
the "implicit" conversions to unicode IMO.
Anyway, I wanted to throw in my .02USD re the implicit conversions, taking the view that
much of the implicitness could be based on reliable inferences from source encodings of
string literals or from their effects as format strings.
[not a normal subscriber to python-dev, so I'll have to google for any responses]