Sharing the joy: after Greg's checkins today, the Berkeley tests
(test_anydbm and, with -u bsddb, test_bsddb3) all pass on WinXP. Yay!
That means there's nothing I routinely expect to fail all the time on
Windows anymore <0.9 wink>.

> Even with cPickle?
AFAIR, yes: it was no longer orders of magnitude slower,
but still significantly slower.
> I'm still worried about your patch, because it changes all uses of
> marshal, not just how code objects are marshalled. The marshal module
> *is* used by other apps...
The change is backwards-compatible in the sense that existing files can
be unmarshalled just fine. Problems will only arise if new marshal output
is unmarshalled by old versions, which could be solved with an option
during marshalling.
Regards,
Martin

Hello,
I've sent this question to comp.lang.python and didn't get any response.
Do you know of a reference implementation of an import hook? That is, a
class, written in Python, that imitates Python's default module loading
behaviour, so that I could append the class to sys.path_hooks and module
loading would not be affected.
I want it so that it would be easy for me to modify it, or subclass it,
for my own needs, while preserving its behaviour. (My situation, if you
are interested, is that in my current setup, modules are loaded from the
network, which takes a long time. I want to create a caching mechanism
which will use the local hard disk.)
It seems to me that such an implementation should be added to the
standard library or at least to the documentation, since it would help
anyone who would like to write an import hook, and would also help
clarify Python's import mechanism. If such an implementation doesn't
exist, I would probably write one anyway, and I would be willing to
share it, if people want it.
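Not the reference implementation asked for, but a minimal sketch of the finder/loader idea, serving modules from an in-memory dict and deferring to the default machinery for everything else (all names here -- SOURCES, DictImporter, demo_mod -- are invented for illustration, and the code uses the protocol as it exists in current importlib):

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical module sources; a caching hook would fetch these from the
# network and store them on local disk instead.
SOURCES = {"demo_mod": "x = 42\n"}

class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, fullname, path, target=None):
        # Returning None hands the import back to the default machinery,
        # so ordinary module loading is unaffected.
        if fullname not in SOURCES:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # A caching hook could consult a local disk cache here before
        # going out to the network for the source.
        exec(SOURCES[module.__name__], module.__dict__)

sys.meta_path.insert(0, DictImporter())
import demo_mod
print(demo_mod.x)  # 42
```

Subclassing and overriding exec_module() is then enough to add a caching layer without touching the rest of the import behaviour.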
Thank you,
Noam Raphael

(previously submitted to comp.lang.python)
I have managed to build Python 2.3.3 with the aCC HP-UX C++
compiler by making a number of one-line changes. (For reasons
beyond my control, I am forced to use this compiler and it has
no C mode at all.)
Typically to get the Python source code to compile under
aCC one has to make a number of trivial changes of the form,
struct whatzit *p;
- p = malloc (sizeof (struct whatzit));
+ p = (struct whatzit *) malloc (sizeof (struct whatzit));
since aCC, being a C++ compiler, has stricter conversion
rules than ANSI C and does not implicitly convert void *
to other pointer types.
Another change is where a forward declaration is
needed for the module type: the aCC compiler complains
about a duplicate definition. I change these from "static"
to "extern", which gives a warning but otherwise works.
For example,
+ #define staticforward ... /* in my case 'extern' */
- static PyTypeObject Comptype;
+ staticforward PyTypeObject Comptype;
(There is/was a staticforward macro which is not used
consistently.)
A third change concerns the Python module initializers
(PyMODINIT_FUNC xxx(void) {...}): they obviously need to
be declared 'extern "C"' (for dl importing), which can
happen in the PyMODINIT_FUNC macro. However, the macro
is not used consistently throughout the Python sources.
Finally, of course there are numerous uses of "new",
"class" and other C++ keywords. I wrote a short flex
script to search and replace through the entire sources
for instances of these.
To summarize the changes needed:
1. explicit casting of void *
2. consistent use of a "staticforward" type
for PyTypeObject forward declarations.
3. consistent use of PyMODINIT_FUNC.
4. use of PyMODINIT_FUNC even in prototypes
(like config.c.in)
5. renaming of C++ reserved words.
(There are other changes specific to the HP-UX
architecture - too numerous to mention.)
My question is: are the Python maintainers interested
in such compatibility?
Although Python will always be strict ANSI C, are such
changes not of general interest for the purposes of
consistency of the source code?
Can someone forward this email to the appropriate
developers list (or tell me which one)?
Shall I prepare a proper patch against 2.3.4?
What would the consensus be on replacements for
'new', 'class', 'template', 'operator', etc.?
Perhaps __new, zew, or new2; klass, __class, or
cla55 etc.?
Has this issue come up before? URLs?
Many thanks, best wishes
-paul
Paul Sheer . . . . . . . . . . . . . . . . . Tel . . +27 (0)21 6869634
Email . . . http://2038bug.com/email.gif . . . . . . . . . . . . . . . .
http://www.icon.co.za/~psheer . . . . . . . . . http://rute.2038bug.com
L I N U X . . . . . . . . . . . . . . . . The Choice of a GNU Generation

I've just submitted patch #980500 to SF, which implements the
gettext improvements exactly as recently discussed. Every
review is welcome. Here is an excerpt:
In _locale module:
- bind_textdomain_codeset() binding
In gettext module:
- bind_textdomain_codeset() function
- lgettext(), lngettext(), ldgettext(), ldngettext(),
which return translated strings encoded in the
preferred system encoding when bind_textdomain_codeset()
has not been used.
- Added equivalent functionality in install() and translate()
functions and catalog classes.
- Documented every change.
--
Gustavo Niemeyer
http://niemeyer.net

I've been messing around with dynamically generating type objects from C.
In particular, I made a module to generate structseq type objects.
See sourceforge bug 624827 and patch 980098.
I ran into a fair amount of difficulty in doing this. There doesn't
appear to be a direct C API for doing this. One possible method is
calling PyType_Type.tp_new() and then updating the returned type object as
needed, but that seemed a little dirty.
First, could someone tell me if I took the correct route for creating a
type object on the heap (based on my patch).
Second, shouldn't there be a more direct API for doing this? Perhaps a
more generalized interface to type_new(), or maybe something completely
new?
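For comparison, a hypothetical Python-level analogy (not the C API route the patch takes): the three-argument type() call is the in-language way to build a new heap type dynamically, much as calling PyType_Type.tp_new from C would.

```python
# Build a heap type at runtime without a class statement.  The name,
# bases, and namespace dict are supplied directly, just as tp_new
# receives them from C.
Point = type("Point", (object,), {"x": 0, "y": 0})

p = Point()
print(p.x, type(Point) is type)  # 0 True
```

This only illustrates the shape of the operation; it says nothing about the refcounting and tp_* slot details the C patch has to get right.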
-Eric

As you may know, the method u"abc".encode(encoding) currently
guarantees that the return value will always be an 8-bit string
value.
Now that more and more codecs become available and the scope
of those codecs goes far beyond only encoding from Unicode to
strings and back, I am tempted to open up that restriction,
thereby opening up u.encode() for applications that wish to
use other codecs that return e.g. Unicode objects as well.
There are several applications for this, such as character
escaping, remapping characters (much like you would use
string.translate() on 8-bit strings), compression, etc. etc.
Note that codecs are not restricted in what they can return
for their .encode() or .decode() method, so any object
type is acceptable, including subclasses of str or
unicode, buffers, mmapped files, etc.
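The point can be illustrated with the rot_13 codec (shown here via the codecs machinery in current Python): its encode() returns a text string, not an 8-bit string, so the codec layer itself clearly does not force byte output.

```python
import codecs

# rot_13 is a text-to-text codec: encoding a str yields another str,
# demonstrating that codecs are free to return non-bytes objects.
s = codecs.encode("Hello", "rot_13")
print(s)  # Uryyb

# Applying it twice round-trips back to the original text.
print(codecs.encode(s, "rot_13"))  # Hello
```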
The needed code change is a one-liner.
What do you think?
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Jun 16 2004)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

While tearing my hair out trying to figure out why my dynamically
allocated type objects weren't getting garbage collected, I noticed that
the tp_new_wrapper object had 1 too many ref counts.
Does the following patch make sense? Seems fairly straightforward to me.
-Eric

> #- > I maintain that when comparing a long with a float
> #- > where the exponent is larger than the precision, that the
> #- > float should be treated as if it were EXACTLY EQUAL TO
> #- > <coefficient>*2^<exponent>, instead of trying to treat it as
> #- > some sort of a range. Similarly, if we implemented a Rational
> #- > type, I would suggest that instances of float (or of Facundo's
> #- > Decimal) where <exponent> is LESS than the digits of
> #- > <coefficient> should be treated as if they were EXACTLY EQUAL
> #- > TO the corresponding fraction-over-a-power-of-two.
Facundo writes:
> Don't get to understand this. Could you send please an
> example of Decimal to
> this case?
Presume a Rational type:
>>> class Rational(object):
...     def __init__(self, num, denom):
...         self.num = num
...         self.denom = denom
...     # ...
Now define the following
>>> x = 12300
>>> y = Decimal.Decimal((0,(1,2,3),2))
>>> z = Rational(1230000,100)
I maintain that x == y should be true (and it is, today).
I maintain that x == z should be true (and it is, obviously).
I maintain that y == z should be true. _IF_ you think
of Decimal.Decimal((0,(1,2,3),2)) as being "any number which,
when rounded to the nearest hundred, gives 12300" (in other
words, as the range 12250..12350), then you might not agree.
But if you think of Decimal.Decimal((0,(1,2,3),2)) as
being *exactly* 12300 (but with a precision of only 3 places),
then you'd want y == z to be true.
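The x == y claim can be checked directly with the decimal module's tuple constructor:

```python
from decimal import Decimal

x = 12300
y = Decimal((0, (1, 2, 3), 2))  # sign 0, digits (1,2,3), exponent 2 -> 12300

# int/Decimal comparison treats y as exactly 12300.
print(x == y)  # True
print(y)       # 1.23E+4
```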
-- Michael Chermside
This email may contain confidential or privileged information. If you believe you have received the message in error, please notify the sender and delete the message without copying or disclosing it.

Armin Rigo writes:
> Is it really what we want? It seems to me that the precision
> of a float gives
> roughly the magnitude of "uncertainty" of the value that it
> represents. If a
> float is large enough, then the same float can represent
> several long integer
> values. It should then probably be considered equal to any
> of these integers,
> instead of just equal to the one arbitrarily returned by
> casting it to long.
I think that what you describe is NOT the behavior that we
want, and I'll try to give an explanation of why. But please
bear with me... I'm still a beginner at understanding some
of the theory behind numerical computing, so I may make some
mistakes. Hopefully those who understand it better will
correct me.
It's a fact that only certain numbers can be represented exactly
when using numbers of finite precision. For instance, floating
point numbers where the exponent is larger than the number of
digits of precision can represent certain integers but not
others. The same situation arises in the use of floating (or
fixed) point decimals -- only certain fractions can be expressed
exactly.
There are (at least) two different mental models we can use to
deal with this limitation of finite precision. One (which you
are espousing above) is to treat each limited-precision
number as if it had an inherent uncertainty, and it represented
all those numbers which are closer to that representable
number than any other. The other mental model is to suppose
that each limited-precision number represents just one
specific value and that the other values simply cannot be
represented.
I always understand things better when I have concrete examples,
so let me use numbers in base 10 with a precision of 3 to
illustrate these two viewpoints. With a precision of 3 we can
write out the values 2.34x10^4 for the number 23,400 and
2.35x10^4 for the number 23,500. But 23,401, 23,402, and all the
others in between can't be represented. So by the "Error bar"
mental model, we think of "2.34x10^4" as shorthand for "some
number from 23,350 to 23,450". By the "Exact values" mental
model we think of "2.34x10^4" as meaning "The exact value
23,400".
Now that we know there are two different mental models, we
have to wonder whether one is inherently better than the
other (and whether it makes any difference). I maintain that
it DOES make a difference, and that the "Exact values" mental
model is BETTER than the "Error bar" model. The reason why it
is better doesn't come into play until you start performing
operations on the numbers, so let's examine that.
Suppose that you are adding two numbers under the "Exact
values" mental model. Adding 1.23x10^4 and 3.55x10^4 is quite
easy... the exact sum of 12,300 and 35,500 is 47,800 or
4.78x10^4. But if we add 1.23x10^4 and 8.55x10^2, then the
"real" sum is 12,300 + 855 = 13,155. That CAN'T be represented
in the "exact values" mental model, so we have to ROUND IT
OFF and store "1.32x10^4" instead. In general, the values are
quite clear in the "Exact values" model, but OPERATIONS may
require rounding to express the result.
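The worked example above can be reproduced with today's decimal module (a later stdlib addition, used here purely to illustrate the arithmetic): with the context precision set to 3 significant digits, the exact sum 13,155 must be rounded to fit the representation.

```python
from decimal import Decimal, getcontext

# Three significant digits, matching the base-10 examples in the text.
getcontext().prec = 3

# The exact sum 12,300 + 855 = 13,155 cannot be represented with three
# digits, so the operation rounds it to 1.32E+4 (i.e. 13,200).
total = Decimal("12300") + Decimal("855")
print(total)  # 1.32E+4
```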
Proponents of the "Error bar" mental model are now thinking
"yes, well our model trades complexity of VALUES for
simplicity in OPERATIONS". But unfortunately that's just not
true. Again, let us try to add 1.23x10^4 and 3.55x10^4. What
that REALLY means (in the "Error bar" mental model) is that
we are adding SOME number in the range 12,250..12,350 to SOME
number in the range 35,450..35,550. The result must be some
number in the range 47,700..47,900. That is LIKELY to belong
to the range 47,750..47,850 (ie 4.78x10^4), but it MIGHT also
come out to something in the 4.77x10^4 range or the 4.79x10^4
range. We intend to return a result of 4.78x10^4, so operations
in the "Error bar" mental model aren't simple, but contain
some sort of "rounding off" as well, only the exact rules for
how THIS "rounding off" works are a bit more confusing -- I'm
not sure I could express them very well.
(Side Note: You might think to fix this problem by saying that
instead of perfect ranges, the numbers represent probability
curves centered around the given number. By using gaussian
curves, you might even be able to make addition "simple" (at
the cost of a very complex idea of "values"). But whatever
distribution worked for addition would NOT work for
multiplication or other operations, so this trick is doomed
to failure.)
So it turns out that WHATEVER you do, operations on finite-
precision numbers are going to require some sort of a "rounding
off" step to keep the results within the domain. That being
said, the "Exact values" mental model is superior. Not only
does it allow a very simple interpretation of the "values",
but it *also* allows the definition of "rounding off" to be
part of the definition of the operations and not be linked to
the values themselves. (For instance, I prefer a simple
"round-to-nearest" rule, but someone else may have a different
rule and that only affects how they perform operations.) It
is for this reason that (I think) most serious specifications
for the behavior of finite-precision numbers prefer to use
the "Exact value" mental model.
(Another side note: You will notice that the argument here
is strikingly similar to the behavior of PEP 327's Decimal
type. That's because I learned just about everything I'm saying
here from following the discussions (and doing the recommended
reading) about Facundo's work there.)
Okay, in my long-winded fashion, I may have shown that the
"Exact values" mental model is superior to the "Error bars"
model. But does it really make any difference to anyone other
than writers of numerical arithmetic specifications? It does,
and one of the best examples is the one that started this
conversation... comparing instances of different numerical
types. I maintain that when comparing a long with a float
where the exponent is larger than the precision, that the
float should be treated as if it were EXACTLY EQUAL TO
<coefficient>*2^<exponent>, instead of trying to treat it as
some sort of a range. Similarly, if we implemented a Rational
type, I would suggest that instances of float (or of Facundo's
Decimal) where <exponent> is LESS than the digits of
<coefficient> should be treated as if they were EXACTLY EQUAL
TO the corresponding fraction-over-a-power-of-two.
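A small sketch of the behaviour being argued for, as it works in current Python, where int/float comparison is exact:

```python
# 1e16 happens to be exactly representable in binary floating point
# (its significand, 5**16, fits in 53 bits).
f = 1e16
print(f == 10**16)            # True: equal to the one integer it denotes
print(f == 10**16 + 1)        # False: unequal to a neighbouring integer...
print(float(10**16 + 1) == f) # True: ...even though that neighbour rounds
                              # to the very same float when converted
```

Under the "Error bar" model the second comparison would arguably have to be True as well, since 10**16 + 1 lies inside the float's "range"; treating the float as exact keeps == transitive and unambiguous.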
All right, those who ACTUALLY understand this stuff (rather
than having learned it from this newsgroup) can go ahead and
correct me now.
Once-a-teacher,-always-trying-to-give-a-lecture-lly yours,
-- Michael Chermside