[Edu-sig] whitespace and newlines and separators ... oh my
ajsiegel at optonline.net
Fri Feb 3 16:02:11 CET 2006
More (semi-IMO) irrelevancy ....
Trying to get to a presentable, announceable alpha release of PyGeo...
and ending up confronting new complexities of the kind I want to just
Alan Kay hasn't gotten there yet, so I am stuck with the prospect of
understanding things and thinking things through.
OK - watch my directory separators in HTML. HTML that works on my local
Windows box breaks on the server. Think Unix separators. Got it..
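For what it's worth, one way to avoid hand-editing separators is to convert Windows-style paths to forward slashes before writing them into HTML. A minimal sketch, assuming the paths are relative; the helper name `to_url_path` and the sample path are made up for illustration:

```python
from pathlib import PureWindowsPath


def to_url_path(windows_path: str) -> str:
    """Return the path with forward slashes, usable in an href or src."""
    # PureWindowsPath parses backslashes on any platform; as_posix()
    # renders the same path with forward slashes.
    return PureWindowsPath(windows_path).as_posix()


print(to_url_path(r"docs\images\triangle.png"))  # docs/images/triangle.png
```

Forward slashes work in HTML on both the local Windows box and the server, so the converted form can be used everywhere.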
Then there is of course the cross platform newline issue - CRLF, CR, LF
which gets me doing search and replace in ascii hex (geeky) and screwing
up working code.
What's the standard way to deal with this issue in a cross platform
distribution? Still don't know.
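Not claiming this is the standard way, but as a sketch of an alternative to search and replace in hex: normalizing line endings can be done on the raw bytes in a couple of lines. The function name `normalize_newlines` is hypothetical, and the choice of LF as the target is an assumption:

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so that its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


sample = b"a\r\nb\rc\n"
print(normalize_newlines(sample))  # b'a\nb\nc\n'
```

Working on bytes (files opened in binary mode) keeps the platform's own newline translation out of the picture.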
But then it gets hairy, and Python specific:
Trying to use the pre-alpha pudge document generator for the portion of
PyGeo I consider to be the underlying "framework".
Choosing pudge because it supports embedded reStructured text, and it
supports "code as text" - which I see as part of PyGeo at a basic level
- by automating the linking of documentation to colorized, html versions
of the actual code. Very cool.
There seems to be a group of heavyweights behind it.
And the code base is small, I can follow it - so when it breaks when
confronting a Boost-wrapped cvisual function I can find the workaround.
First bigger problem is that pudge chokes on what I think is valid
reStructured text - finding tables that pass reStructured scrutiny in
stand-alone files to be malformed when embedded in triple quoted doc
comments. I am assuming this is some kind of whitespace parsing issue,
but haven't dug into the code far enough to verify. I am hoping it is
something I can solve and feed back into the pudge project. Remains to be seen.
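If the whitespace guess is right, one plausible culprit is the indentation that doc comments carry inside the source file. The sketch below only shows how the standard library's `inspect.cleandoc` strips that indentation before the text reaches a reStructuredText parser; whether pudge does anything like this is not something I have verified, and the sample text is invented:

```python
import inspect

# Hypothetical doc comment text as it sits in an indented source file:
# every line after the first carries four spaces of indentation.
raw = """Summary line.

    =====  =====
    A      B
    =====  =====
    1      2
    =====  =====
    """

# cleandoc removes the common leading indentation and trailing blank lines.
clean = inspect.cleandoc(raw)
for line in clean.splitlines():
    print(repr(line))
```

After cleaning, the table borders start in column zero, the same as they would in a stand-alone file.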
More surprising was the html colorizing problem. The colorizing code
relies on tokenize.py from the standard library - which keeps choking on
code that compiles and runs fine under Python. So I go to tabnanny.py,
which seems to be there exactly to diagnose these kinds of issues.
But one of the symptoms of the problem is that tabnanny (i.e. tokenize)
is parsing the file in such a way that it is reporting back line numbers
that don't correspond to the code when viewed in a text editor.
So it's hard to pinpoint the problem.
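One possible explanation for the line number mismatch, and this is only an assumption, is a file with mixed line endings, which an editor and a parser may count differently. A small sketch for checking what a file actually contains; the function name `count_line_endings` is made up:

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF line endings in raw file bytes."""
    names = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    # The alternation tries CRLF first, so a CRLF pair is counted once.
    for ending in re.findall(rb"\r\n|\r|\n", data):
        counts[names[ending]] += 1
    return counts


print(count_line_endings(b"a\r\nb\rc\nd\n"))  # {'CRLF': 1, 'CR': 1, 'LF': 2}
```

For a real file, the bytes would come from opening it in binary mode. More than one nonzero count means the file mixes conventions.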
Turns out (I think) - this took me a while - that tokenize seems to be
trying to parse things between triple quoted strings, and since there is
a lot of code intended to output to Povray SDL and formatted for that
purpose - it is choking on whitespace issues (that IMO shouldn't be
issues) in that code.
Is it fair to think that all rules should be off between """ ...and...
""" - and that if I am right that this is where the choke is, that I
should file a bug report?
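A quick way to test the hypothesis is to feed tokenize a small source string with oddly indented text inside triple quotes and look at the tokens. The sketch below uses the current Python 3 tokenize API, which may not behave like the version discussed here, and the Povray-like sample is invented:

```python
import io
import tokenize

# A triple-quoted string whose contents mix tabs and spaces.
source = 'sdl = """\n\tsphere {\n   <0, 0, 0>, 1\n\t}\n"""\nx = 1\n'

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
strings = [t for t in tokens if t.type == tokenize.STRING]

# Print the start and end (row, column) of each string token.
for tok in strings:
    print(tok.start, tok.end)
```

In current Python 3 the whole triple-quoted block comes back as a single STRING token spanning lines 1 to 5, so the whitespace inside it is not tokenized separately. If an older tokenize reports errors inside such a block, a reduced case like this would be a reasonable thing to attach to a bug report.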
OTOH, seems unlikely that this has not been confronted before.
Any clues to what I may be missing are appreciated.