--- "Barry A. Warsaw" <bwarsaw(a)cnri.reston.va.us>
wrote:
>
> I'm starting to think about devday topics. Sounds
> like an I18n
> session would be very useful. Champions?
>
I'm willing to explain what the fuss is about to
bemused onlookers and give some examples of problems
it should be able to solve - plenty of good slides and
screen shots. I'll stay well away from the C
implementation issues.
Regards,
Andy
=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.
__________________________________________________
Do You Yahoo!?
Bid and sell for free at http://auctions.yahoo.com
> 2. Are there plans for an internationalization
> session at IPC8? Perhaps a
> few key players could be locked into a room for a
> couple days, to emerge
> bloodied, but with an implementation in-hand...
Excellent idea.
- Andy
> a slightly hairer design issue is what combinations
> of pattern and string the new 're' will handle.
>
> the first two are obvious:
>
> ordinary pattern, ordinary string
> unicode pattern, unicode string
>
> but what about these?
>
> ordinary pattern, unicode string
> unicode pattern, ordinary string
I think the logical thing to do would be to "promote" the ordinary pattern or
string to unicode, in a similar way to what happens if you combine ints and
floats in a single expression.
The result may be a bit surprising if your pattern is in ascii and you've
never been aware of unicode and are given such a string from somewhere else,
but then if you're only aware of integer arithmetic and are suddenly presented
with a couple of floats you'll also be pretty surprised at the result. At
least it's easily explained.
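The analogy can be made concrete with a short sketch (modern Python, which postdates this thread, is used purely for illustration):

```python
# Like-with-like matching is the uncontroversial case:
import re

assert re.match(rb"ab+", b"abbb")    # ordinary pattern, ordinary string
assert re.match("ab+", "abbb")       # unicode pattern, unicode string

# "Promotion" in the mixed cases would decode the ordinary side to
# unicode first, the way 1 + 2.5 promotes the int operand to a float:
assert 1 + 2.5 == 3.5
promoted = b"abbb".decode("ascii")   # ordinary string promoted to unicode
assert re.match("ab+", promoted)     # then matched against a unicode pattern
```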
--
Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen(a)oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm
Guido has asked me to get involved in this discussion,
as I've been working practically full-time on i18n for
the last year and a half and have done quite a bit
with Python in this regard. I thought the most
helpful thing would be to describe the real-world
business problems I have been tackling so people can
understand what one might want from an encoding
toolkit. In this (long) post I have included:
1. who I am and what I want to do
2. useful sources of info
3. a real world i18n project
4. what I'd like to see in an encoding toolkit
Grab a coffee - this is a long one.
1. Who I am
--------------
Firstly, credentials. I'm a Python programmer by
night, and when I can involve it in my work which
happens perhaps 20% of the time. More relevantly, I
did a postgrad course in Japanese Studies and lived in
Japan for about two years; in 1990 when I returned, I
was speaking fairly fluently and could read a
newspaper with regular reference to a dictionary.
Since then my Japanese has atrophied badly, but it is
good enough for IT purposes. For the last year and a
half I have been internationalizing a lot of systems -
more on this below.
My main personal interest is that I am hoping to
launch a company using Python for reporting, data
cleaning and transformation. An encoding library is
sorely needed for this.
2. Sources of Knowledge
------------------------------
We should really go for world class advice on this.
Some people who could really contribute to this
discussion are:
- Ken Lunde, author of "CJKV Information Processing"
and head of Asian Type Development at Adobe.
- Jeffrey Friedl, author of "Mastering Regular
Expressions", and a long time Japan resident and
expert on things Japanese
- Maybe some of the Ruby community?
I'll list books, URLs etc. for anyone who needs them
on request.
3. A Real World Project
----------------------------
18 months ago I was offered a contract with one of the
world's largest investment management companies (which
I will nickname HugeCo), who (after many years having
analysts out there) were launching a business in Japan
to attract savers; due to recent legal changes,
Japanese people can now freely buy into mutual funds
run by foreign firms. Given the 2% they historically
get on their savings, and the 12% that US equities
have returned for most of this century, this is a
business with huge potential. I've been there for a
while now, rotating through many different IT projects.
HugeCo runs its non-US business out of the UK. The
core deal-processing business runs on IBM AS400s.
These are kind of a cross between a relational
database and a file system, and speak their own
encoding called EBCDIC. Five years ago the AS400 had limited
connectivity to everything else, so they also started
deploying Sybase databases on Unix to support some
functions. This means 'mirroring' data between the
two systems on a regular basis. IBM has always
included encoding information on the AS400 and it
converts from EBCDIC to ASCII on request with most of
the transfer tools (FTP, database queries etc.)
To make things work for Japan, everyone realised that
a double-byte representation would be needed.
Japanese has about 7000 characters in most IT-related
character sets, and there are a lot of ways to store
it. Here's a potted language lesson. (Apologies to
people who really know this field -- I am not going to
be fully pedantic or this would take forever).
Japanese includes two phonetic alphabets (each with
about 80-90 characters), the thousands of Kanji, and
English characters, often all in the same sentence.
The first attempt to display something was to
make a single-byte character set which included
ASCII, and a simplified (and very ugly) katakana
alphabet in the upper half of the code page. So you
could spell out the sounds of Japanese words using
'half width katakana'.
The basic 'character set' is Japanese Industrial Standard
0208 ("JIS"). This was defined in 1978, the first
official Asian character set to be defined by a
government. This can be thought of as a printed chart
showing the characters - it does not define their
storage on a computer. It defined a logical 94 x 94
grid, and each character has an index in this grid.
The "JIS" encoding was a way of mixing ASCII and
Japanese in text files and emails. Each Japanese
character had a double-byte value. It had 'escape
sequences' to say 'You are now entering ASCII
territory' or the opposite. A few years later Microsoft
came up with Shift-JIS, a smarter encoding.
This basically said "Look at the next byte. If below
127, it is ASCII; if between A and B, it is a
half-width
katakana; if between B and C, it is the first half of
a double-byte character and the next one is the second
half". Extended Unix Code (EUC) does similar tricks.
Both have the property that there are no control
characters, and ASCII is still ASCII. There are a few
other encodings too.
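The first-byte dispatch that Shift-JIS requires can be sketched as follows; the numeric ranges are the commonly documented Shift-JIS ones, filled in here for concreteness:

```python
def classify_sjis_byte(b):
    """Classify a lead byte in a Shift-JIS stream (standard ranges)."""
    if b < 0x80:
        return "ascii"
    if 0xA1 <= b <= 0xDF:
        return "halfwidth-katakana"
    if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC:
        return "double-byte-lead"    # the next byte is the second half
    return "invalid"

assert classify_sjis_byte(0x41) == "ascii"           # 'A'
assert classify_sjis_byte(0xB1) == "halfwidth-katakana"
assert classify_sjis_byte(0x88) == "double-byte-lead"
```

The key property mentioned above holds: bytes below 128 are plain ASCII, so ASCII files pass through unchanged.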
Unfortunately for me and HugeCo, IBM had their own
standard before the Japanese government did, and it
differs; it is most commonly called DBCS (Double-Byte
Character Set). This involves shift-in and shift-out
sequences (0x16 and 0x17, cannot remember which way
round), so you can mix single and double bytes in a
field. And we used AS400s for our core processing.
So, back to the problem. We had a FoxPro system using
ShiftJIS on the desks in Japan which we wanted to
replace in stages, and an AS400 database to replace it
with. The first stage was to hook them up so names
and addresses could be uploaded to the AS400, and data
files consisting of daily report input could be
downloaded to the PCs. The AS400 supposedly had a
library which did the conversions, but no one at IBM
knew how it worked. The people who did all the
evaluations had basically proved that 'Hello World' in
Japanese could be stored on an AS400, but never looked
at the conversion issues until mid-project. Not only
did we need a conversion filter, we had the problem
that the character sets were of different sizes. So
it was possible - indeed, likely - that some of our
ten thousand customers' names and addresses would
contain characters only on one system or the other,
and fail to
survive a round trip. (This is the absolute key issue
for me - will a given set of data survive a round trip
through various encoding conversions?)
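The round-trip question can be posed mechanically; here is a minimal sketch using the codecs of a modern Python, standing in for the AS400 and Shift-JIS converters described here:

```python
def survives_round_trip(text, encoding="shift_jis"):
    """True if text comes back unchanged from an encode/decode cycle."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        return False

assert survives_round_trip("Hello")               # plain ASCII is safe
assert survives_round_trip("\u30c6\u30b9\u30c8")  # katakana 'tesuto'
assert not survives_round_trip("\u20ac")          # no euro sign in Shift-JIS
```

In practice you would run every record through a check like this before trusting a conversion path.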
We figured out how to get the AS400 to do the
conversions during a file transfer in one direction,
and I wrote some Python scripts to make up files with
each official character in JIS on a line; these went
up with conversion, came back binary, and I was able
to build a mapping table and 'reverse engineer' the
IBM encoding. It was straightforward in theory, "fun"
in practice. I then wrote a Python library which knew
about the AS400 and Shift-JIS encodings, and could
translate a string between them. It could also detect
corruption and warn us when it occurred. (This is
another key issue - you will often get badly encoded
data, half a kanji or a couple of random bytes, and
need to be clear on your strategy for handling it in
any library). It was slow, but it got us our gateway
in both directions, and it warned us of bad input. 360
characters in the DBCS encoding actually appear twice,
so perfect round trips are impossible, but practically
you can survive with some validation of input at both
ends. The final story was that our names and
addresses were mostly safe, but a few obscure symbols
weren't.
A big issue was that field lengths varied. An address
field 40 characters long on a PC might grow to 42 or
44 on an AS400 because of the shift characters, so the
software would truncate the address during import, and
cut a kanji in half. This resulted in a string that
was illegal DBCS, and errors in the database. To
guard against this, you need really picky input
validation. You not only ask 'is this string valid
Shift-JIS', you check it will fit on the other system
too.
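That double check ('valid here, fits there') might look like this sketch; the 2-byte shift-out/shift-in overhead per double-byte run follows the DBCS description above, and the field lengths are invented:

```python
def fits_both_systems(text, sjis_limit=40, dbcs_limit=40):
    """Check a string fits in both the Shift-JIS and the DBCS field."""
    sjis_len = len(text.encode("shift_jis"))
    runs = 0                 # number of double-byte runs needing shift chars
    prev_wide = False
    for ch in text:
        wide = len(ch.encode("shift_jis")) == 2
        if wide and not prev_wide:
            runs += 1
        prev_wide = wide
    dbcs_len = sjis_len + 2 * runs   # shift-out + shift-in per run
    return sjis_len <= sjis_limit and dbcs_len <= dbcs_limit

assert fits_both_systems("abc")
assert fits_both_systems("\u65e5" * 19)      # 38 + 2 shift bytes = 40: fits
assert not fits_both_systems("\u65e5" * 20)  # 40 + 2 shift bytes = 42: too long
```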
The next stage was to bring in our Sybase databases.
Sybase make a Unicode database, which works like the
usual one except that all your SQL code suddenly
becomes case sensitive - more (unrelated) fun when
you have 2000 tables. Internally it stores data in
UTF8, which is a 'rearrangement' of Unicode which is
much safer to store in conventional systems.
Basically, a UTF8 character is between one and three
bytes, there are no nulls or control characters, and
the ASCII characters are still the same ASCII
characters. UTF8<->Unicode involves some bit
twiddling but is one-to-one and entirely algorithmic.
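For the curious, the bit twiddling for the one-to-three-byte cases can be spelt out directly from the UTF8 bit layout:

```python
def to_utf8(cp):
    """Encode a code point up to U+FFFF by hand, per the UTF8 bit layout."""
    if cp < 0x80:                          # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                         # 110xxxxx 10xxxxxx
        return bytes([0xC0 | cp >> 6, 0x80 | cp & 0x3F])
    return bytes([0xE0 | cp >> 12,         # 1110xxxx 10xxxxxx 10xxxxxx
                  0x80 | cp >> 6 & 0x3F,
                  0x80 | cp & 0x3F])

assert to_utf8(ord("A")) == b"A"                     # ASCII maps to itself
assert to_utf8(0x65E5) == "\u65e5".encode("utf-8")   # a kanji -> 3 bytes
```

Note that the ASCII range maps to itself, which is exactly why UTF8 is safe to pass through conventional systems.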
We had a product to 'mirror' data between AS400 and
Sybase, which promptly broke when we fed it Japanese.
The company bought a library called Unilib to do
conversions, and started rewriting the data mirror
software. This library (like many) uses Unicode as a
central point in all conversions, and offers most of
the world's encodings. We wanted to test it, and used
the Python routines to put together a regression
test. As expected, it was mostly right but had some
differences, which we were at least able to document.
We also needed to rig up a daily feed from the legacy
FoxPro database into Sybase while it was being
replaced (about six months). We took the same
library, built a DLL wrapper around it, and I
interfaced to this with DynWin, so we were able to do
the low-level string conversion in compiled code and
the high-level
control in Python. A FoxPro batch job wrote out
delimited text in shift-JIS; Python read this in, ran
it through the DLL to convert it to UTF8, wrote that
out as UTF8 delimited files, and ftp'ed them to an
'in' directory on the Unix box ready for daily import.
At this point we had a lot of fun with field widths -
Shift-JIS is much more compact than UTF8 when you have
a lot of kanji (e.g. address fields).
Another issue was half-width katakana. These were the
earliest attempt to get some form of Japanese out of a
computer, and are single-byte characters above 128 in
Shift-JIS - but are not part of the JIS0208 standard.
They look ugly and are discouraged; but when you are
entering a long address in a field of a database, and
it won't quite fit, the temptation is to go from
two-bytes-per-character to one (just hit F7 in
Windows) to save space. Unilib rejected these (as
would Java), but has optional modes to preserve them
or 'expand them out' to their full-width equivalents.
The final technical step was our reports package.
This is a 4GL using a really horrible 1980s Basic-like
language which reads in fixed-width data files and
writes out Postscript; you write programs saying 'go
to x,y' and 'print customer_name', and can build up
anything you want out of that. It's a monster to
develop in, but when done it really works -
million page jobs no problem. We had bought into this
on the promise that it supported Japanese; actually, I
think they had got the equivalent of 'Hello World' out
of it, since we had a lot of problems later.
The first stage was that the AS400 would send down
fixed width data files in EBCDIC and DBCS. We ran
these through a C++ conversion utility, again using
Unilib. We had to filter out and warn about corrupt
fields, which the conversion utility would reject.
Surviving records then went into the reports program.
It then turned out that the reports program only
supported some of the Japanese alphabets.
Specifically, it had a built in font switching system
whereby when it encountered ASCII text, it would flip
to the most recent single byte font, and when it found
a byte above 127, it would flip to a double byte font.
This is because many Chinese fonts do (or did)
not include English characters, or included really
ugly ones. This was wrong for Japanese, and made the
half-width katakana unprintable. I found out that I
could control fonts if I printed one character at a
time with a special escape sequence, so wrote my own
bit-scanning code (tough in a language without ord()
or bitwise operations) to examine a string, classify
every byte, and control the fonts the way I wanted.
So a special subroutine is used for every name or
address field. This is apparently not unusual in GUI
development (especially web browsers) - you rarely
find a complete Unicode font, so you have to switch
fonts on the fly as you print a string.
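The scanning itself is simple in Python (the 4GL made it hard); here is a hedged sketch that splits a Shift-JIS byte string into per-font runs, using the standard byte ranges:

```python
def font_runs(data):
    """Split Shift-JIS bytes into (font, bytes) runs for printing."""
    runs, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            kind, step = "ascii", 1
        elif 0xA1 <= b <= 0xDF:
            kind, step = "half-katakana", 1
        else:
            kind, step = "kanji", 2      # lead byte plus trail byte
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1] + data[i:i + step])
        else:
            runs.append((kind, data[i:i + step]))
        i += step
    return runs

assert font_runs(b"AB") == [("ascii", b"AB")]
assert font_runs(bytes([0x93, 0xFA, 0x41])) == [
    ("kanji", b"\x93\xfa"), ("ascii", b"A")]
```

Each run would then be emitted with the appropriate font-switching escape sequence.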
After all of this, we had a working system and knew
quite a bit about encodings. Then the curve ball
arrived: User Defined Characters!
It is not true to say that there are exactly 6879
characters in Japanese, any more than you can count the
number of languages on the Indian sub-continent or the
types of cheese in France. There are historical
variations and they evolve. Some people's names got
missed out, and others like to write a kanji in an
unusual way. Others arrived from China where they
have more complex variants of the same characters.
Despite the Japanese government's best attempts, these
people have dug their heels in and want to keep their
names the way they like them. My first reaction was
'Just Say No' - I basically said that if one of these
customers (14 out of a database of 8000) could show me
a tax form or phone bill with the correct UDC on it,
we would implement it but not otherwise (the usual
workaround is to spell their name phonetically in
katakana). But our marketing people put their foot
down.
A key factor is that Microsoft has 'extended the
standard' a few times. First of all, Microsoft and
IBM include an extra 360 characters in their code page
which are not in the JIS0208 standard. This is well
understood and most encoding toolkits know that 'Code
Page 932' is Shift-JIS plus a few extra characters.
Secondly, Shift-JIS has a User-Defined region of a
couple of thousand characters. They have lately been
taking Chinese variants of Japanese characters (which
are readable but a bit old-fashioned - I can imagine
pipe-smoking professors using these forms as an
affectation) and adding them into their standard
Windows fonts; so users are getting used to these
being available. These are not in a standard.
Thirdly, they include something called the 'Gaiji
Editor' in Japanese Win95, which lets you add new
characters to the fonts on your PC within the
user-defined region. The first step was to review all
the PCs in the Tokyo office, and get one centralized
extension font file on a server. This was also fun as
people had assigned different code points to
characters on different machines, so what looked
correct on your word processor was a black square on
mine. Effectively, each company has its own custom
encoding a bit bigger than the standard.
Clearly, none of these extensions would convert
automatically to the other platforms.
Once we actually had an agreed list of code points, we
scanned the database by eye and made sure that the
relevant people were using them. We decided that
space for 128 User-Defined Characters would be
allowed. We thought we would need a wrapper around
Unilib to intercept these values and do a special
conversion; but to our amazement it worked! Somebody
had already figured out a mapping for at least 1000
characters for all the Japanese encodings, and they did
the round trips from Shift-JIS to Unicode to DBCS and
back. So the conversion problem needed less code than
we thought. This mapping is not defined in a standard
AFAIK (certainly not for DBCS anyway).
We did, however, need some really impressive
validation. When you input a name or address on any
of the platforms, the system should say
(a) is it valid for my encoding?
(b) will it fit in the available field space in the
other platforms?
(c) if it contains user-defined characters, are they
the ones we know about, or is this a new guy who will
require updates to our fonts etc.?
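A sketch of such a validator (the agreed code point set and the AS400 field limit are invented for illustration; cp932 stands in for Shift-JIS plus the IBM/Microsoft extensions):

```python
KNOWN_UDC = {(0xF0, 0x40), (0xF0, 0x41)}   # assumed agreed code points

def validate_field(data, other_limit=40):
    """Run checks (a), (b) and (c) on a Shift-JIS byte string."""
    try:                                   # (a) valid for this encoding?
        data.decode("cp932")
    except UnicodeDecodeError:
        return ["not valid Shift-JIS/cp932"]
    problems, runs, in_wide, i = [], 0, False, 0
    while i < len(data):
        b = data[i]
        wide = b >= 0x81 and not (0xA1 <= b <= 0xDF)
        if wide and not in_wide:
            runs += 1                      # each run costs shift-out/shift-in
        in_wide = wide
        i += 2 if wide else 1
        if wide and 0xF0 <= b <= 0xF9:     # (c) user-defined lead byte?
            if (b, data[i - 1]) not in KNOWN_UDC:
                problems.append("unknown user-defined character")
    if len(data) + 2 * runs > other_limit: # (b) fits the DBCS field?
        problems.append("too long for the AS400 field")
    return problems

assert validate_field(b"hello") == []
assert validate_field(bytes([0xF0, 0x42])) == ["unknown user-defined character"]
```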
Finally, we got back to the display problems. Our
chosen range had a particular first byte. We built a
miniature font with the characters we needed starting
in the lower half of the code page. I then
generalized my name-printing routine to say 'if the
first character is XX, throw it away, and print the
subsequent character in our custom font'. This worked
beautifully - not only could we print everything, we
were using type 1 embedded fonts for the user defined
characters, so we could distill it and also capture it
for our internal document imaging systems.
So, that is roughly what is involved in building a
Japanese client reporting system that spans several
platforms.
I then moved over to the web team to work on our
online trading system for Japan, where I am now -
people will be able to open accounts and invest on the
web. The first stage was to prove it all worked.
With HTML, Java and the Web, I had high hopes, which
have mostly been fulfilled - we set an option in the
database connection to say 'this is a UTF8 database',
and Java converts it to Unicode when reading the
results, and we set another option saying 'the output
stream should be Shift-JIS' when we spew out the HTML.
There is one limitation: Java sticks to the JIS0208
standard, so the 360 extra IBM/Microsoft Kanji and our
user defined characters won't work on the web. You
cannot control the fonts on someone else's web
browser; management accepted this because we gave them
no alternative. Certain customers will need to be
warned, or asked to suggest a standard version of a
character if they want to see their name on the web.
I really hope the web actually brings character usage
in line with the standard in due course, as it will
save a fortune.
Our system is multi-language - when a customer logs
in, we want to say 'You are a Japanese customer of our
Tokyo Operation, so you see page X in language Y'.
The language strings are all kept in UTF8 in XML
files, so the same file can hold many languages. This
and the database are the real-world reasons why you
want to store stuff in UTF8. There are very few tools
to let you view UTF8, but luckily there is a free Word
Processor that lets you type Japanese and save it in
any encoding; so we can cut and paste between
Shift-JIS and UTF8 as needed.
And that's it. No climactic endings and a lot of real
world mess, just like life in IT. But hopefully this
gives you a feel for some of the practical stuff
internationalisation projects have to deal with. See
my other mail for actual suggestions.
- Andy Robinson
I got the wish list below. Anyone care to comment on how close we are
on fulfilling some or all of this?
--Guido van Rossum (home page: http://www.python.org/~guido/)
------- Forwarded Message
Date: Thu, 04 Nov 1999 20:26:54 +0700
From: "Claudio Ramón" <rmn70(a)hotmail.com>
To: guido(a)python.org
Hello,
I'm a Python user (excuse my English, I'm Spanish and...). I think it is a
very complete language and I use it to solve statistics, physics,
mathematics, chemistry and biology problems. I'm not an
experienced programmer, only a scientist with problems to solve.
The motive of this letter is to explain some needs that I have in
using Python, which I hope the next versions might address...
* GNU CC for Win32 compatibility (compilation of the Python interpreter and
the "Freeze" utility). I think MingWin32 (Mumit Khan) is a good alternative,
avoiding use of the cygwin dll.
* Add low level programming capabilities for system access and for speeding up
code fragments, avoiding the need for C/C++ or Java code. Python, I think, must
be a complete programming language in the "programming for everybody" philosophy.
* Incorporate WxWindows (wxPython) and/or Gtk+ (a win32 port now exists) GUIs
in the standard distribution. For example, wxPython provides an HTML browser,
which is very important for document presentations. And WxWindows and Gtk+ are
faster than Tk.
* Incorporate a database system in the standard library distribution, if
possible with relational and document capabilities, and with import facilities
for DBASE, Paradox and MSAccess files.
* Incorporate an XML/HTML/Math-ML editor/browser with graphics capability (if
possible with XML as the internal file format), and if possible with
Microsoft Word import/export facilities. For example, the AbiWord project could
be an alternative, but it lacks a programming language. If we could make Python
the programming language for the AbiWord project...
Thanks.
Ramón Molina.
rmn70(a)hotmail.com
______________________________________________________
Get Your Private, Free Email at http://www.hotmail.com
------- End of Forwarded Message
Hi all...
I've updated some of the modules at http://www.lyra.org/greg/python/.
Specifically, there is a new httplib.py, davlib.py, qp_xml.py, and
a new imputil.py. The latter will be updated again RSN with some patches
from Jim Ahlstrom.
Besides some tweaks/fixes/etc, I've also clarified the ownership and
licensing of the things. httplib and davlib are (C) Guido, licensed under
the Python license (well... anything he chooses :-). qp_xml and imputil
are still Public Domain. I also added some comments into the headers to
note where they come from (I've had a few people remark that they ran
across the module but had no idea who wrote it or where to get updated
versions :-), and I inserted a CVS Id to track the versions (yes, I put
them into CVS just now).
Note: as soon as I figure out the paperwork or whatever, I'll also be
skipping the whole "wetsign.txt" thingy and just transfer everything to
Guido. He remarked a while ago that he will finally own some code in the
Python distribution(!) despite not writing it :-)
I might encourage others to consider the same...
Cheers,
-g
--
Greg Stein, http://www.lyra.org/
James C. Ahlstrom writes:
> Guido van Rossum wrote:
> > I got the wish list below. Anyone care to comment on how close we are
> > on fulfilling some or all of this?
>
> > * GNU CC for Win32 compatibility (compilation of the Python interpreter and
> > the "Freeze" utility). I think MingWin32 (Mumit Khan) is a good alternative,
> > avoiding use of the cygwin dll.
>
> I don't know what this means.
mingw32: 'minimalist gcc for win32'. it's gcc on win32 without trying
to be unix. It links against crtdll, so for example it can generate
small executables that run on any win32 platform. It's also an
alternative to plunking down money every year to keep up with MSVC++.
I used to use mingw32 a lot, and it's even possible to set up egcs to
cross-compile to it. At one point using egcs on linux I was able to
build a stripped-down python.exe for win32...
http://agnes.dida.physik.uni-essen.de/~janjaap/mingw32/
-Sam
I've OCR'd Saltzer's paper. It's available temporarily (in MS Word
format) at http://sirac.inrialpes.fr/~marangoz/tmp/Saltzer.zip
Since there may be legal problems with LNCS, I will disable the
link shortly (so those of you who have not received a copy and are
interested in reading it, please grab it quickly)
If prof. Saltzer agrees (and if he legally can) to put it on his web page,
I guess that the paper will show up at http://mit.edu/saltzer/
Jeremy, could you please check this with prof. Saltzer? (This version
might need some corrections due to the OCR process, despite that I've
made a significant effort to clean it up)
--
Vladimir MARANGOZOV | Vladimir.Marangozov(a)inrialpes.fr
http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252
I have for some time been wondering about the usefulness of this
mailing list. It seems to have produced staggeringly few results
since inception.
This is not a criticism of any individual, but of the process. It is
proof in my mind of how effective the benevolent dictator model is,
and how ineffective a language run by committee would be.
This "committee" never seems to be capable of reaching a consensus on
anything. A number of issues don't seem to provoke any responses. As
a result, many things seem to die a slow and lingering death. Often
there is lots of interesting discussion, but still precious few
results.
In the pre python-dev days, the process seemed easier - we mailed
Guido directly, and he either stated "yea" or "nay" - maybe we didn't
get the response we hoped for, but at least we got a response. Now,
we have the result that even if Guido does enter into a thread, the
noise seems to drown out any hope of getting anything done. Guido
seems to be faced with the dilemma of asserting his dictatorship in
the face of many dissenting opinions from many people he respects, or
putting it in the too hard basket. I fear the latter is the easiest
option. At the end of this mail I list some of the major threads over
the last few months, and can't see a single thread that has resulted
in a CVS checkin, and only one that has resulted in agreement. This,
to my mind at least, is proof that things are really not working.
I long for the "good old days" - take the replacement of "ni" with
built-in functionality, for example. I posit that if this was
discussed on python-dev, it would have caused a huge flood of mail,
and nothing remotely resembling a consensus. Instead, Guido simply
wrote an essay and implemented some code that he personally liked. No
debate, no discussion. Still an excellent result. Maybe not a
perfect result, but a result nonetheless.
However, Guido's time is becoming increasingly limited. So should we
consider moving to a "benevolent lieutenant" model, in conjunction
with re-ramping up the SIGs? This would provide 2 ways to get things
done:
* A new SIG. Take relative imports, for example. If we really do
need a change in this fairly fundamental area, a SIG would be
justified ("import-sig"). The responsibility of the SIG is to form a
consensus (and code that reflects it), and report back to Guido (and
the main newsgroup) with the result of this. It worked well for RE,
and allowed those of us not particularly interested to keep out of the
debate. If the SIG can not form consensus, then tough - it dies - and
should not be mourned. Presumably Guido would keep a watchful eye
over the SIG, providing direction where necessary, but in general stay
out of the day to day traffic. New SIGs seem to have stopped since
this list's creation, and it seems that issues that should be discussed
in new SIGs are now discussed here.
* Guido could delegate some of his authority to a single individual
responsible for a certain limited area - a benevolent lieutenant. We
might have a lieutenant responsible for each of several areas, who could
only exercise their authority over small, trivial changes. Eg, the "getopt
helper" thread - if a lieutenant was given authority for the "standard
library", they could simply make a yea or nay decision, and present it
to Guido. Presumably Guido trusts this person he delegated to enough
that the majority of the lieutenant's recommendations would be
accepted. Presumably there would be a small number of lieutenants,
and they would then become the new "python-dev" - say up to 5 people.
This list would then discuss high level strategies and seek direction from
each other when things get murky. This select group of people may not
(indeed, probably would not) include me, but I would have no problem
with that - I would prefer to see results achieved than have my own
ego stroked by being included in a select, but ineffective group.
In parting, I repeat this is not a direct criticism, simply an
observation of the last few months. I am on this list, so I am
definitely as guilty as anyone else - which is "not at all" - i.e., no
one is guilty; I simply see it as endemic to a committee with people
of diverse backgrounds, skills and opinions.
Any thoughts?
Long live the dictator! :-)
Mark.
Recent threads, and my take on the results:
* getopt helper?
Too much noise regarding semantic changes.
* Alternative Approach to Relative Imports
* Relative package imports
* Path hacking
* Towards a Python based import scheme
Too much noise - no one could really agree on the semantics.
Implementation thrown in the ring, and promptly forgotten.
* Corporate installations
Very young, but no result at all.
* Embedding Python when using different calling conventions
Quite young, but no result as yet, and I have no reason to believe
there will be.
* Catching "return" and "return expr" at compile time
Seemed to be blessed - yay! Don't believe I have seen a check-in yet.
* More Python command-line features
Seemed general agreement, but nothing happened?
* Tackling circular dependencies in 2.0?
Lots of noise, but no results other than "GC may be there in 2.0"
* Buffer interface in abstract.c
Determined it could break - no solution proposed. Lots of noise
regarding whether it is a good idea at all!
* mmapfile module
No result.
* Quick-and-dirty weak references
No result.
* Portable "spawn" module for core?
No result.
* Fake threads
Seemed to spawn stackless Python, but in the face of Guido being "at
best, lukewarm" about this issue, I would again have to conclude "no
result". An authoritative "no" in this area may have saved lots of
effort and heartache.
* add Expat to 1.6
No result.
* I'd like list.pop to accept an optional second argument giving a
default value
No result
* etc
No result.
[Extracted from the psa-members list...]
Gordon McMillan wrote:
>
> Chris Fama wrote,
> > And now the rub: the exact same function definition has passed
> > through byte-compilation perfectly OK many times before with no
> > problems... of course, this points rather clearly to the
> > preceding code, but it illustrates a failing in Python's syntax
> > error messages, and IMHO a fairly serious one at that, if this is
> > indeed so.
>
> My simple experiments refuse to compile a "del getattr(..)" at
> all.
Hmm, it seems to be a fairly generic error:
>>> del f(x,y)
SyntaxError: can't assign to function call
How about changing the com_assign_trailer function in Python/compile.c
to:
static void
com_assign_trailer(c, n, assigning)
	struct compiling *c;
	node *n;
	int assigning;
{
	REQ(n, trailer);
	switch (TYPE(CHILD(n, 0))) {
	case LPAR: /* '(' [exprlist] ')' */
		com_error(c, PyExc_SyntaxError,
			  assigning ? "can't assign to function call" :
			              "can't delete expression");
		break;
	case DOT: /* '.' NAME */
		com_assign_attr(c, CHILD(n, 1), assigning);
		break;
	case LSQB: /* '[' subscriptlist ']' */
		com_subscriptlist(c, CHILD(n, 1), assigning);
		break;
	default:
		com_error(c, PyExc_SystemError, "unknown trailer type");
	}
}
or something along those lines...
BTW, has anybody tried my import patch recently? I haven't heard
any criticism since posting it and wonder what made the list fall
asleep over the topic :-)
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 61 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/