[Tutor] html color coding: where to start

Kent Johnson kent37 at tds.net
Fri Sep 18 01:24:42 CEST 2009


2009/9/17 Emad Nawfal (عماد نوفل) <emadnawfal at gmail.com>:
> Hi Tutors,
> I want to color-code the different parts of the word in a morphologically
> complex natural language. The file I have looks like this, where the first
> column is the word, and the second is the composite part-of-speech tag. For
> example, Al is a DETERMINER, wlAy is a NOUN and At is a PLURAL NOUN SUFFIX
>
> Al+wlAy+At        DET+NOUN+NSUFF_FEM_PL
> Al+mtHd+p        DET+ADJ+NSUFF_FEM_SG
>
> The output I want is one in which the word has no plus signs, and each
> segment is color-coded with a grammatical category. For example, the noun is
> red, the det is green, and the suffix is orange.  Like on this page here:
> http://docs.google.com/View?id=df7jv9p9_3582pt63cc4

Here is an example that duplicates your Google doc and generates
fairly clean, idiomatic HTML. It requires the HTML generation package
from
http://pypi.python.org/pypi/html/1.4

from html import HTML

lines = '''
Al+wlAy+At        DET+NOUN+NSUFF_FEM_PL
Al+mtHd+p        DET+ADJ+NSUFF_FEM_SG
'''.splitlines()

# Define colors in a CSS stylesheet
styles = '''
.NOUN
{color: red }

.ADJ
{color: brown }

.DET
{color: green}

.NSUFF_FEM_PL, .NSUFF_FEM_SG
{color: blue}
'''

h = HTML()

with h.html:
    with h.head:
        h.title("Example")
        h.style(styles)

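    with h.body(newlines=True):
        for line in lines:
            # Each data line has two whitespace-separated columns:
            # the segmented word and its composite part-of-speech tag.
            fields = line.split()
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            word, pos = fields

            # Pair each segment with its tag; the tag doubles as the CSS class.
            for part, kind in zip(word.split("+"), pos.split("+")):
                h.span(part, klass=kind)
            h.br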

print h
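
If installing the html package is not an option, the per-line logic above
can be roughly sketched with the standard library alone. This is only an
illustration, not part of the original example: the function name colorize
is made up here, and the length check is an added assumption, since zip()
in the loop above silently drops any extra segments or tags when the two
columns do not line up.

```python
from xml.sax.saxutils import escape, quoteattr


def colorize(line):
    """Turn one 'word tags' line into a string of <span> elements.

    Hypothetical helper: returns None for blank or malformed lines,
    including lines whose segment and tag counts differ.
    """
    fields = line.split()
    if len(fields) != 2:
        return None
    segments = fields[0].split("+")
    tags = fields[1].split("+")
    if len(segments) != len(tags):
        return None  # assumption: treat a mismatch as malformed input
    # quoteattr() adds the surrounding quotes and escapes the class name;
    # escape() protects the segment text itself.
    return "".join(
        "<span class=%s>%s</span>" % (quoteattr(tag), escape(segment))
        for segment, tag in zip(segments, tags)
    )
```

Calling colorize() on each input line and joining the non-None results
with "<br>" would give a body comparable to the one generated above; the
stylesheet still has to be emitted separately in a <style> element.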


Kent

