[Doc-SIG] State of XML documentation standard project.

Brian Quinlan brian@sweetapp.com
Fri, 12 Apr 2002 09:56:15 -0700


This is a multi-part message in MIME format.

------=_NextPart_000_003C_01C1E208.49D84310
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit

Vivian wrote:
> I'm sorry that I couldn't find out what's happening with the XML migration
> project, and I hope I'm not bothering you with a question I should be able
> to answer alone.

In order to avoid learning LaTeX for my Python extensions, I wrote a
simple XSLT stylesheet that converts my made-up XML grammar to HTML.

I've attached the XML, stylesheet and resulting HTML. 
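
To give a feel for what the stylesheet does, here is a rough sketch of its
funcdesc and arguments templates redone in Python with the standard library.
It is only an illustration, not the attached XSLT: the sample input is a
trimmed-down fragment, the helper names render_arguments and render_funcdesc
are made up for this example, the output markup is simplified, and the
description is treated as plain text (the real grammar allows nested markup).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# A trimmed-down <funcdesc> in the made-up grammar (hypothetical sample input).
SAMPLE = """
<funcdesc>
    <name>find_lexer_module_by_name</name>
    <arguments>
        <argument>name</argument>
    </arguments>
    <description>Returns a LexerModule object given its name.</description>
</funcdesc>
"""


def render_arguments(arguments):
    """Render <arguments> as '(a[, b])', bracketing optional arguments."""
    parts = []
    for position, argument in enumerate(arguments.findall("argument")):
        text = "<var>%s</var>" % escape(argument.text or "")
        if position > 0:
            text = ", " + text
        # Optional arguments are marked with default="yes" in the grammar.
        if argument.get("default") == "yes":
            text = "[" + text + "]"
        parts.append(text)
    return "(" + "".join(parts) + ")"


def render_funcdesc(funcdesc):
    """Render one <funcdesc> as a simplified HTML definition list."""
    name = escape(funcdesc.findtext("name", default=""))
    args = render_arguments(funcdesc.find("arguments"))
    # Collapse whitespace; nested markup in the description is not handled.
    raw_description = funcdesc.findtext("description", default="")
    description = escape(" ".join(raw_description.split()))
    return "<dl><dt><b><tt>%s</tt></b> %s</dt><dd>%s</dd></dl>" % (
        name, args, description)


html = render_funcdesc(ET.fromstring(SAMPLE))
print(html)
```

The XSLT version gets the recursion into nested elements for free through
xsl:apply-templates, which is the main reason I went with a stylesheet.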

I too would be willing to help move us towards a standard SGML
documentation solution.

Cheers,
Brian



------=_NextPart_000_003C_01C1E208.49D84310
Content-Type: text/xml;
	name="PySilverCityDocs.xml"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="PySilverCityDocs.xml"

<section>
   =20
<declarepackage>SilverCity</declarepackage>

<heading>
<package>SilverCity</package>=20
    --- Multilanguage lexical analysis package
</heading>

<p>SilverCity is a library that can provide lexical analysis for over 20 =
different=20
programming languages. SilverCity is packaged as both a C++ library and =
as a=20
Python extension. This documentation applies to the Python =
extension.</p>
<p>At this point I'd like to acknowledge that this documentation is =
incomplete. Writing
isn't a hobby of mine. So if you need any help, just let me know at =
&lt;brian@sweetapp.com&gt;.</p>
<contents/>

<seealso>
    <see>
        <ref href=3D"http://www.scintilla.org/"/>
        <title>Scintilla</title>
        <description>Scintilla is the open-source source editing =
component
        upon which SilverCity is built</description>
    </see>
    <see>
        <ref =
href=3D"http://www.python.org/doc/current/lib/module-tokenize.html"/>
        <title>Python tokenize module</title>
        <description>Python's built-in lexical scanner for Python source =
code.
        </description>
    </see>
</seealso>

<subsection>
<title>Module Contents</title>
<funcdesc>
    <name>find_lexer_module_by_id</name>
    <arguments>
        <argument>id</argument>
    </arguments>
    <description>
        The <function>find_lexer_module_by_id</function> function
        returns a <class>LexerModule</class> object given an integer
        constant. These constants are defined in the=20
        <refmodule>ScintillaConstants</refmodule> module.
        <p>A <exception>ValueError</exception> is raised if the
        provided constant does not map to a =
<class>LexerModule</class>.</p>
    </description>
</funcdesc>

<funcdesc>
    <name>find_lexer_module_by_name</name>
    <arguments>
        <argument>name</argument>
    </arguments>
    <description>
        The <function>find_lexer_module_by_name</function> function
        returns a <class>LexerModule</class> object given its name.
        <p>A <exception>ValueError</exception> is raised if no =
<class>LexerModule</class>
        has the given name.</p>
    </description>
</funcdesc>

<classconstructor>
    <name>WordList</name>
    <arguments>
        <argument default=3D"yes">keywords</argument>
    </arguments>
    <description>
        Create a new <class>WordList</class> instance.=20
        This class is used by the=20
        <class>LexerModule</class> class to determine which
        words should be lexed as keywords.
        <p>
        <var>keywords</var> should be
        a string containing keywords separated by spaces=20
        e.g. "and assert break class..."
        </p>
        <p><class>WordList</class> objects have no methods. They simply =
act as placeholders for=20
        language keywords.</p>
    </description>
</classconstructor>

<classconstructor>
    <name>PropertySet</name>
    <arguments>
        <argument default=3D"yes">properties</argument>
    </arguments>
    <description>
        Create a new <class>PropertySet</class> instance.=20
        This class is used by the=20
        <class>LexerModule</class> class to determine various
        lexer options. For example, the 'styling.within.preprocessor'
        property determines if the C lexer should use a single or
        multiple lexical states when parsing C preprocessor expressions.
        <p>
        <var>properties</var> should be
        a dictionary of lexer options.
        </p>
    </description>
</classconstructor>
</subsection>
<subsection>
<title>LexerModule objects</title>
<p>The <class>LexerModule</class> class provides a single method:</p>
<methoddesc>
    <name>get_number_of_wordlists</name>
    <arguments>
    </arguments>
    <description>
        Returns the number of <class>WordLists</class> that the lexer =
requires.=20
        This is the number of <class>WordLists</class> that must be
        passed to <method>tokenize_by_style</method>.
        <p>If the number of required WordLists cannot be determined, a =
ValueError
        is raised.</p>
    </description>
</methoddesc>

<methoddesc>
    <name>tokenize_by_style</name>
    <arguments>
        <argument>source</argument>
        <argument>keywords</argument>
        <argument>propertyset</argument>
        <argument default=3D"yes">func</argument>
    </arguments>
    <description>
        Lexes the provided source code into a list of tokens. Each token =
is a dictionary with the following
        keys:
        <p>
            <table>
                <thead><th>Key</th><th>Value</th></thead>
                <tbody>
                    <tr><td>style</td><td>The lexical style of the token =
e.g. 11</td></tr>
                    <tr><td>text</td><td>The text of the token e.g. =
'import'</td></tr>
                    <tr><td>start_index</td><td>The index in =
<var>source</var> where the token begins e.g. 0</td></tr>
                    <tr><td>end_index</td><td>The index in =
<var>source</var> where the token ends e.g. 5</td></tr>
                    <tr><td>start_column</td><td>The column position =
(0-based) where the token begins e.g. 0</td></tr>
                    <tr><td>end_column</td><td>The column position =
(0-based) where the token ends e.g. 5</td></tr>
                    <tr><td>start_line</td><td>The line position =
(0-based) where the token begins e.g. 0</td></tr>
                    <tr><td>end_line</td><td>The line position (0-based) =
where the token ends e.g. 0</td></tr>
                </tbody>
            </table>       =20
        </p>
       =20
        <p><var>source</var> is a string containing the source code.

        <var>keywords</var> is a list of <class>WordList</class> =
instances.
        The number of <class>WordLists</class> that should be passed =
depends on the particular=20
        <class>LexerModule</class>.
       =20
        <var>propertyset</var> is a <class>PropertySet</class> instance.
        The relevant properties are dependent on the particular =
<class>LexerModule</class>.</p>

        <p>If the optional <var>func</var> argument is used, it must be =
a callable object. It will
        be called, using keyword arguments, for each token found in the =
source. Since additional
        keys may be added in the future, it is recommended that =
additional keys be collected e.g.
        <example>
            import SilverCity
            from SilverCity import ScintillaConstants
           =20
            def func(style, text, start_column, start_line, =
**other_args):=20
                if style =3D=3D ScintillaConstants.SCE_P_WORD and text =
=3D=3D 'import':
                    print 'Found an import statement at (%d, %d)' % =
(start_line + 1, start_column + 1)
           =20
           =20
            keywords =3D =
SilverCity.WordList(SilverCity.Keywords.python_keywords)
            properties =3D SilverCity.PropertySet()
            lexer =3D =
SilverCity.find_lexer_module_by_id(ScintillaConstants.SCLEX_PYTHON)
           =20
            lexer.tokenize_by_style(source_code, keywords, properties, =
func)
        </example></p>

    </description>
</methoddesc>
</subsection>

<subsection>
<title>WordList objects</title>
<p><class>WordList</class> objects have no methods. They simply act as =
placeholders for=20
language keywords.</p>
</subsection>

<subsection>
<title>PropertySet objects</title>
<p><class>PropertySet</class> objects have no methods. They act as =
dictionaries where the
names of the properties are the keys. All keys must be strings; values =
will be converted to strings
upon assignment, i.e. retrieved values will always be strings. There is =
no mechanism to delete
assigned keys.</p>
<p>Different properties apply to different languages. The following =
table is a complete
list of properties, the language that they apply to, and their =
meanings:</p>

<table>
    <thead><th>Property</th><th>Language</th><th>Values</th></thead>
    <tbody>
        <tr><td>asp.default.language</td><td>HTML</td>
<td>Sets the default language for ASP scripts:<br/>
0 =3D> None<br/>
1 =3D> JavaScript<br/>
2 =3D> VBScript<br/>
3 =3D> Python<br/>
4 =3D> PHP<br/>
5 =3D> XML-based<br/>
</td></tr>
    <tr><td>styling.within.preprocessor</td><td>C++</td>
<td>Determines if all preprocessor directives should be lexed =
identically or
if subexpressions should be given different lexical states:<br/>
0 =3D> Same<br/>
1 =3D> Different<br/>
</td></tr>
    <tr><td>tab.timmy.whinge.level</td><td>Python</td>
<td>The property value is a bitfield that causes different types of =
incorrect=20
whitespace characters to cause their lexical states to be incremented by =
64:<br/>
0 =3D> no checking<br/>
1 =3D> check for correct indenting<br/>
2 =3D> check for literal tabs<br/>
4 =3D> check for literal spaces used as tabs<br/>
8 =3D> check for mixed tabs and spaces<br/>
</td></tr>
    </tbody>
</table>
<p>Example <class>PropertySet</class> usage:
<example>
    import SilverCity
   =20
    propset =3D SilverCity.PropertySet({'styling.within.preprocessor' : =
0})
    propset['styling.within.preprocessor'] =3D 1 # changed my mind
</example>
</p>
</subsection>

<subsection>
<title>Stuff that should be documented better</title>
<p>The <module>ScintillaConstants</module> module contains a list of =
lexer identifiers
(used by <function>find_lexer_module_by_id</function>) and lexical =
states for each
<class>LexerModule</class>. You should take a look at this module to =
find the states
that are useful for your programming language.</p>
<p>The <module>Keywords</module> module contains lists of keywords that =
can be
used to create <class>WordList</class> objects.</p>
<p>There are also some modules that package =
<function>tokenize_by_style</function>
into a class that offers a visitor pattern (think SAX). You don't have =
to worry about these
modules if you don't want to. But, if you do, they are all written in =
Python so you can probably
muddle through.</p>
<p>Note that some lexers that are supported by Scintilla are not =
supported by=20
<package>SilverCity</package>. This is because I am lazy. Any =
contributions are welcome
(and should be pretty easy to make).</p>
</subsection>

</section>
------=_NextPart_000_003C_01C1E208.49D84310
Content-Type: text/xml;
	name="doc-template.xsl"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="doc-template.xsl"

<xsl:stylesheet=20
	version=3D"1.0"
	xmlns:xsl=3D"http://www.w3.org/1999/XSL/Transform">=09

<xsl:template match=3D"subsection/title" mode=3D"title"><a =
href=3D"#{generate-id()}"><xsl:value-of select=3D"."/></a>
</xsl:template>

<xsl:template match=3D"subsection" mode=3D"index">
	<li><xsl:apply-templates select=3D"title" mode=3D"title"/></li>
	<ul><xsl:apply-templates select=3D"child::subsection" =
mode=3D"index"/></ul>
</xsl:template>

<xsl:template match=3D"br">
    <br/>
</xsl:template>

<xsl:template match=3D"/">
	<html>
		<head>
                <!-- XXX hard coded title -->
			<title>SilverCity</title>
			<link rel=3D"STYLESHEET" =
href=3D"http://www.python.org/doc/current/lib/lib.css"/>
		</head>
		<body>
			<xsl:apply-templates/>
		</body>
	</html>
</xsl:template>

<xsl:template match=3D"declarepackage"/>

<xsl:template match=3D"xxx">
	XXX - <em><xsl:apply-templates/></em>
</xsl:template>

<xsl:template match=3D"contents">
	<strong>Subsections</strong>
	<ul><xsl:apply-templates select=3D"//subsection" mode=3D"index"/></ul>
</xsl:template>

<xsl:template match=3D"heading">
	<h1><xsl:apply-templates/></h1>
</xsl:template>

<xsl:template match=3D"var">
	<var><xsl:apply-templates/></var>
</xsl:template>

<xsl:template match=3D"example">
	<dl>
		<dd><dt/><pre class=3D"verbatim"><xsl:apply-templates/></pre></dd>
	</dl>
</xsl:template>

<xsl:template match=3D"module|package|function|exception|class|code">
	<tt><xsl:apply-templates/></tt>
</xsl:template>

<xsl:template match=3D"p">
	<p><xsl:apply-templates/></p>
</xsl:template>

<xsl:template match=3D"see">
	<dl compact=3D"true" class=3D"seetitle">
		<dt>
		<em class=3D"citetitle">
		<a>
			<xsl:attribute name=3D"href"><xsl:value-of =
select=3D"ref/attribute::href"/></xsl:attribute>
			<xsl:value-of select=3D"title"/>
		</a>
		</em>
		</dt>
		<dd>
			<xsl:apply-templates select=3D"description"/>
		</dd>
	</dl>
</xsl:template>

<xsl:template match=3D"seealso">
	<div class=3D"seealso">
		<p class=3D"heading"><b>See Also:</b></p>
		<xsl:apply-templates select=3D"see"/>
	</div>
</xsl:template>

<xsl:template match=3D"funcdesc|methoddesc">
	<dl>
		<dt>
			<b><a name=3D""><tt class=3D"function"><xsl:value-of =
select=3D"name"/></tt></a></b>
			<xsl:apply-templates select=3D"arguments"/>
		</dt>
		<dd><xsl:apply-templates select=3D"description"/></dd>
	</dl>
</xsl:template>

<xsl:template match=3D"classconstructor">
	<dl>
		<dt>
			<b><a name=3D""><span class=3D"typelabel">class </span><tt =
class=3D"class"><xsl:value-of select=3D"name"/></tt></a></b>
			<xsl:apply-templates select=3D"arguments"/>
		</dt>
		<dd><xsl:apply-templates select=3D"description"/></dd>
	</dl>
</xsl:template>

<xsl:template match=3D"arguments">
	(<xsl:for-each select=3D"argument">
		<xsl:if test=3D"attribute::default=3D'yes'"><big>[</big></xsl:if>
		<xsl:if test=3D"position() > 1">, </xsl:if>
		<var><xsl:apply-templates/></var>
		<xsl:if test=3D"attribute::default=3D'yes'"><big>]</big></xsl:if>
	</xsl:for-each>)
</xsl:template>

<xsl:template match=3D"table">
	<table border=3D"yes" align=3D"center" style=3D"border-collapse: =
collapse">
		<xsl:apply-templates/>
	</table>
</xsl:template>

<xsl:template match=3D"subsection/title">
	<a name=3D"{generate-id()}"><h2><xsl:apply-templates/></h2></a>
</xsl:template>

<xsl:template match=3D"tr">
	<tr><xsl:apply-templates/></tr>
</xsl:template>

<xsl:template match=3D"thead">
	<thead><tr class=3D"tableheader"><xsl:apply-templates/></tr></thead>
</xsl:template>

<xsl:template match=3D"tbody">
	<tbody valign=3D"baseline"><xsl:apply-templates/></tbody>
</xsl:template>

<xsl:template match=3D"th">
	<th align=3D"left"><b><xsl:apply-templates/></b></th>
</xsl:template>

<xsl:template match=3D"td">
	<td align=3D"left" valign=3D"baseline"><xsl:apply-templates/></td>
</xsl:template>
</xsl:stylesheet>
------=_NextPart_000_003C_01C1E208.49D84310
Content-Type: text/html;
	name="docs.html"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="docs.html"

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>SilverCity</TITLE>
<META http-equiv=3DContent-Type content=3D"text/html; =
charset=3DUTF-8"><LINK=20
href=3D"docs_files/lib.css" rel=3DSTYLESHEET>
<META content=3D"MSHTML 5.50.4616.200" name=3DGENERATOR></HEAD>
<BODY>
<H1><TT>SilverCity</TT> --- Multilanguage lexical analysis package </H1>
<P>SilverCity is a library that can provide lexical analysis for over 20 =

different programming languages. SilverCity is packaged as both a C++ =
library=20
and as a Python extension. This documentation applies to the Python=20
extension.</P>
<P>At this point I'd like to acknowledge that this documentation is =
incomplete.=20
Writing isn't a hobby of mine. So if you need any help, just let me =
know at=20
&lt;brian@sweetapp.com&gt;.</P><STRONG>Subsections</STRONG>=20
<UL>
  <LI><A href=3D"#N0.33">Module Contents</A>=20
  <UL></UL>
  <LI><A href=3D"#N0.B5">LexerModule objects</A>=20
  <UL></UL>
  <LI><A href=3D"#N0.159">WordList objects</A>=20
  <UL></UL>
  <LI><A href=3D"#N0.164">PropertySet objects</A>=20
  <UL></UL>
  <LI><A href=3D"#N0.1C4">Stuff that should be documented better</A>=20
  <UL></UL></LI></UL>
<DIV class=3Dseealso>
<P class=3Dheading><B>See Also:</B> </P>
<DL class=3Dseetitle compact>
  <DT><EM class=3Dcitetitle><A =
href=3D"http://www.scintilla.org/">Scintilla</A></EM>=20

  <DD>Scintilla is the open-source source editing component upon which=20
  SilverCity is built </DD></DL>
<DL class=3Dseetitle compact>
  <DT><EM class=3Dcitetitle><A=20
  =
href=3D"http://www.python.org/doc/current/lib/module-tokenize.html">Pytho=
n=20
  tokenize module</A></EM>=20
  <DD>Python's built-in lexical scanner for Python source code. =
</DD></DL></DIV><A=20
name=3DN0.33>
<H2>Module Contents</H2></A>
<DL>
  <DT><B><A name=3D""><TT =
class=3Dfunction>find_lexer_module_by_id</TT></A></B>=20
  (<VAR>id</VAR>)=20
  <DD>The <TT>find_lexer_module_by_id</TT> function returns a=20
  <TT>LexerModule</TT> object given an integer constant. These constants =
are=20
  defined in the ScintillaConstants module.=20
  <P>A <TT>ValueError</TT> is raised if the provided constant does not =
map to a=20
  <TT>LexerModule</TT>.</P></DD></DL>
<DL>
  <DT><B><A name=3D""><TT =
class=3Dfunction>find_lexer_module_by_name</TT></A></B>=20
  (<VAR>name</VAR>)=20
  <DD>The <TT>find_lexer_module_by_name</TT> function returns a=20
  <TT>LexerModule</TT> object given its name.=20
  <P>A <TT>ValueError</TT> is raised if no <TT>LexerModule</TT> has the =
given=20
  name.</P></DD></DL>
<DL>
  <DT><B><A name=3D""><SPAN class=3Dtypelabel>class </SPAN><TT=20
  class=3Dclass>WordList</TT></A></B>=20
  (<BIG>[</BIG><VAR>keywords</VAR><BIG>]</BIG>)=20
  <DD>Create a new <TT>WordList</TT> instance. This class is used by the =

  <TT>LexerModule</TT> class to determine which words should be lexed as =

  keywords.=20
  <P><VAR>keywords</VAR> should be a string containing keywords =
separated by=20
  spaces e.g. "and assert break class..." </P>
  <P><TT>WordList</TT> objects have no methods. They simply act as =
placeholders=20
  for language keywords.</P></DD></DL>
<DL>
  <DT><B><A name=3D""><SPAN class=3Dtypelabel>class </SPAN><TT=20
  class=3Dclass>PropertySet</TT></A></B>=20
  (<BIG>[</BIG><VAR>properties</VAR><BIG>]</BIG>)=20
  <DD>Create a new <TT>PropertySet</TT> instance. This class is used by =
the=20
  <TT>LexerModule</TT> class to determine various lexer options. For =
example,=20
  the 'styling.within.preprocessor' property determines if the C lexer =
should=20
  use a single or multiple lexical states when parsing C preprocessor=20
  expressions.=20
  <P><VAR>properties</VAR> should be a dictionary of lexer options.=20
</P></DD></DL><A name=3DN0.B5>
<H2>LexerModule objects</H2></A>
<P>The <TT>LexerModule</TT> class provides a single method:</P>
<DL>
  <DT><B><A name=3D""><TT =
class=3Dfunction>get_number_of_wordlists</TT></A></B> ()=20
  <DD>Returns the number of <TT>WordLists</TT> that the lexer requires. =
This is=20
  the number of <TT>WordLists</TT> that must be passed to =
tokenize_by_style.=20

  <P>If the number of required WordLists cannot be determined, a =
ValueError is=20
  raised.</P></DD></DL>
<DL>
  <DT><B><A name=3D""><TT =
class=3Dfunction>tokenize_by_style</TT></A></B>=20
  (<VAR>source</VAR>, <VAR>keywords</VAR>, =
<VAR>propertyset</VAR><BIG>[</BIG>,=20
  <VAR>func</VAR><BIG>]</BIG>)=20
  <DD>Lexes the provided source code into a list of tokens. Each token =
is a=20
  dictionary with the following keys:=20
  <P>
  <TABLE style=3D"BORDER-COLLAPSE: collapse" align=3Dcenter =
border=3Dyes>
    <THEAD>
    <TR class=3Dtableheader>
      <TH align=3Dleft><B>Key</B></TH>
      <TH align=3Dleft><B>Value</B></TH></TR></THEAD>
    <TBODY vAlign=3Dbaseline>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>style</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The lexical style of the token =
e.g.=20
    11</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>text</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The text of the token e.g. =
'import'</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>start_index</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The index in <VAR>source</VAR> =
where the=20
        token begins e.g. 0</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>end_index</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The index in <VAR>source</VAR> =
where the=20
        token ends e.g. 5</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>start_column</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The column position (0-based) =
where the=20
        token begins e.g. 0</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>end_column</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The column position (0-based) =
where the=20
        token ends e.g. 5</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>start_line</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The line position (0-based) =
where the=20
        token begins e.g. 0</TD></TR>
    <TR>
      <TD vAlign=3Dbaseline align=3Dleft>end_line</TD>
      <TD vAlign=3Dbaseline align=3Dleft>The line position (0-based) =
where the=20
        token ends e.g. 0</TD></TR></TBODY></TABLE></P>
  <P><VAR>source</VAR> is a string containing the source code.=20
  <VAR>keywords</VAR> is a list of <TT>WordList</TT> instances. The =
number of=20
  <TT>WordLists</TT> that should be passed depends on the particular=20
  <TT>LexerModule</TT>. <VAR>propertyset</VAR> is a <TT>PropertySet</TT> =

  instance. The relevant properties are dependent on the particular=20
  <TT>LexerModule</TT>.</P>
  <P>If the optional <VAR>func</VAR> argument is used, it must be a =
callable=20
  object. It will be called, using keyword arguments, for each token =
found in=20
  the source. Since additional keys may be added in the future, it is=20
  recommended that additional keys be collected e.g.=20
  <DL>
    <DD>
    <DT><PRE class=3Dverbatim>            import SilverCity
            from SilverCity import ScintillaConstants
           =20
            def func(style, text, start_column, start_line, =
**other_args):=20
                if style =3D=3D ScintillaConstants.SCE_P_WORD and text =
=3D=3D 'import':
                    print 'Found an import statement at (%d, %d)' % =
(start_line + 1, start_column + 1)
           =20
           =20
            keywords =3D =
SilverCity.WordList(SilverCity.Keywords.python_keywords)
            properties =3D SilverCity.PropertySet()
            lexer =3D =
SilverCity.find_lexer_module_by_id(ScintillaConstants.SCLEX_PYTHON)
           =20
            lexer.tokenize_by_style(source_code, keywords, properties, =
func)
        </PRE></DT></DL>
  <P></P></DD></DL><A name=3DN0.159>
<H2>WordList objects</H2></A>
<P><TT>WordList</TT> objects have no methods. They simply act as =
placeholders=20
for language keywords.</P><A name=3DN0.164>
<H2>PropertySet objects</H2></A>
<P><TT>PropertySet</TT> objects have no methods. They act as =
dictionaries where=20
the names of the properties are the keys. All keys must be strings; =
values will=20
be converted to strings upon assignment, i.e. retrieved values will =
always be=20
strings. There is no mechanism to delete assigned keys.</P>
<P>Different properties apply to different languages. The following =
table is a=20
complete list of properties, the language that they apply to, and their=20
meanings:</P>
<TABLE style=3D"BORDER-COLLAPSE: collapse" align=3Dcenter border=3Dyes>
  <THEAD>
  <TR class=3Dtableheader>
    <TH align=3Dleft><B>Property</B></TH>
    <TH align=3Dleft><B>Language</B></TH>
    <TH align=3Dleft><B>Values</B></TH></TR></THEAD>
  <TBODY vAlign=3Dbaseline>
  <TR>
    <TD vAlign=3Dbaseline align=3Dleft>asp.default.language</TD>
    <TD vAlign=3Dbaseline align=3Dleft>HTML</TD>
    <TD vAlign=3Dbaseline align=3Dleft>Sets the default language for ASP =

      scripts:<BR>0 =3D&gt; None<BR>1 =3D&gt; JavaScript<BR>2 =3D&gt; =
VBScript<BR>3=20
      =3D&gt; Python<BR>4 =3D&gt; PHP<BR>5 =3D&gt; =
XML-based<BR></TD></TR>
  <TR>
    <TD vAlign=3Dbaseline align=3Dleft>styling.within.preprocessor</TD>
    <TD vAlign=3Dbaseline align=3Dleft>C++</TD>
    <TD vAlign=3Dbaseline align=3Dleft>Determines if all preprocessor =
directives=20
      should be lexed identically or if subexpressions should be given =
different=20
      lexical states:<BR>0 =3D&gt; Same<BR>1 =3D&gt; =
Different<BR></TD></TR>
  <TR>
    <TD vAlign=3Dbaseline align=3Dleft>tab.timmy.whinge.level</TD>
    <TD vAlign=3Dbaseline align=3Dleft>Python</TD>
    <TD vAlign=3Dbaseline align=3Dleft>The property value is a bitfield =
that=20
      causes different types of incorrect whitespace characters to cause =
their=20
      lexical states to be incremented by 64:<BR>0 =3D&gt; no =
checking<BR>1 =3D&gt;=20
      check for correct indenting<BR>2 =3D&gt; check for literal =
tabs<BR>4 =3D&gt;=20
      check for literal spaces used as tabs<BR>8 =3D&gt; check for mixed =
tabs and=20
      spaces<BR></TD></TR></TBODY></TABLE>
<P>Example <TT>PropertySet</TT> usage:=20
<DL>
  <DD>
  <DT><PRE class=3Dverbatim>    import SilverCity
   =20
    propset =3D SilverCity.PropertySet({'styling.within.preprocessor' : =
0})
    propset['styling.within.preprocessor'] =3D 1 # changed my mind
</PRE></DT></DL>
<P></P><A name=3DN0.1C4>
<H2>Stuff that should be documented better</H2></A>
<P>The <TT>ScintillaConstants</TT> module contains a list of lexer =
identifiers=20
(used by <TT>find_lexer_module_by_id</TT>) and lexical states for each=20
<TT>LexerModule</TT>. You should take a look at this module to find the =
states=20
that are useful for your programming language.</P>
<P>The <TT>Keywords</TT> module contains lists of keywords that can be =
used to=20
create <TT>WordList</TT> objects.</P>
<P>There are also some modules that package <TT>tokenize_by_style</TT> =
into a=20
class that offers a visitor pattern (think SAX). You don't have to worry =
about=20
these modules if you don't want to. But, if you do, they are all written =
in=20
Python so you can probably muddle through.</P>
<P>Note that some lexers that are supported by Scintilla are not =
supported by=20
<TT>SilverCity</TT>. This is because I am lazy. Any contributions are =
welcome=20
(and should be pretty easy to make).</P></BODY></HTML>

------=_NextPart_000_003C_01C1E208.49D84310--