[Tutor] html color coding: where to start

bob gailer bgailer at gmail.com
Thu Sep 17 19:36:41 CEST 2009

Emad Nawfal (عماد نوفل) wrote:
> Hi Tutors,
> I want to color-code the different parts of the word in a 
> morphologically complex natural language. The file I have looks like 
> this, where the first column is the word, and the second is the 
> composite part of speech tag. For example, Al is a DETERMINER, wlAy is 
> Al+wlAy+At        DET+NOUN+NSUFF_FEM_PL
> Al+mtHd+p        DET+ADJ+NSUFF_FEM_SG
> The output I want is one on which the word has no plus signs, and each 
> segment is color-coded with a grammatical category. For example, the 
> noun is red, the det is green, and the suffix is orange.  Like on this 
> page here:
> http://docs.google.com/View?id=df7jv9p9_3582pt63cc4
> I am stuck with the html part and I don't know where to start. I have 
> no experience with html, but I have this skeleton (which may not be 
> the right thing anyway)
> Any help with materials, modules, suggestions appreciated.
> This skeleton of my program is as follows:
> #############
> RED = ("NOUN", "ADJ")
> GREEN = ("DET", "DEMON")

Instead of that use a dictionary:

colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
              NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")
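
A minimal sketch of how the lookup might then go. Note that your tags look like NSUFF_FEM_PL while the dictionary key is NSUFF, so I am assuming that only the part before the first underscore decides the color. That rule, the helper name color_for, and the BLACK fallback are my assumptions, not something taken from your data:

```python
# Tag-to-color table, as suggested above.
colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
              NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")

def color_for(tag, default="BLACK"):
    """Return the color name for a tag such as 'NSUFF_FEM_PL'.

    Assumption: only the part before the first underscore selects the
    color; unknown tags fall back to `default`.
    """
    base = tag.split("_")[0]
    return colors.get(base, default)

# Show the color chosen for each tag of one composite POS string.
for tag in "DET+NOUN+NSUFF_FEM_PL".split("+"):
    print(tag, color_for(tag))
```

With this, the if/else chain in the loop can shrink to a single dictionary lookup per segment.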
> # print html head
> def print_html_head():
>     #print the head of the html page
> def print_html_tail():
>    # print the tail of the html page
> def color(segment, color):
>    # STUCK HERE should take a color, and a segment for example
> # main
> import sys
> infile = open(sys.argv[1]) # takes as input the POS-tagged file
> print_html_head()
> for line in infile:
>     line = line.split()
>     if len(line) != 2: continue
>     word = line[0]
>     pos = line[1]
>     zipped = zip(word.split("+"), pos.split("+"))
>     for x, y in zipped:
>         if y in DET:
>             color(x, "#FF0000")
>         else:
>             color(x, "#0000FF")
> print_html_tail()       
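
For the HTML part, one simple approach is to wrap each segment in a <span> element with an inline CSS color and join the spans without the plus signs. Here is a rough sketch along the lines of your skeleton. The function names word_to_html and page, the lowercase CSS color names, the black fallback, and the rule of matching tags on the part before the first underscore are all my assumptions, so adjust them to your data:

```python
import html

# Assumed tag-to-color table (CSS color names).
COLORS = {"NOUN": "red", "ADJ": "red", "DET": "green", "DEMON": "green",
          "NSUFF": "orange", "VSUFF": "orange", "ADJSUFF": "orange"}

def color(segment, color_name):
    """Wrap one segment in a <span> with an inline CSS color."""
    return '<span style="color: %s">%s</span>' % (color_name,
                                                  html.escape(segment))

def word_to_html(word, pos, default="black"):
    """Turn 'Al+wlAy+At' and 'DET+NOUN+NSUFF_FEM_PL' into colored spans."""
    parts = []
    for seg, tag in zip(word.split("+"), pos.split("+")):
        base = tag.split("_")[0]  # assumption: 'NSUFF_FEM_PL' -> 'NSUFF'
        parts.append(color(seg, COLORS.get(base, default)))
    return "".join(parts)  # joined without the plus signs

def page(lines):
    """Build a complete HTML page, one paragraph per well-formed line."""
    out = ['<html><head><meta charset="utf-8">'
           '<title>Color-coded words</title></head><body>']
    for line in lines:
        fields = line.split()
        if len(fields) != 2:  # skip lines that are not "word tag"
            continue
        out.append("<p>%s</p>" % word_to_html(fields[0], fields[1]))
    out.append("</body></html>")
    return "\n".join(out)

print(page(["Al+wlAy+At        DET+NOUN+NSUFF_FEM_PL",
            "Al+mtHd+p        DET+ADJ+NSUFF_FEM_SG"]))
```

Redirect the output to a file ending in .html and open it in a browser to see the result. To process your real file, pass open(sys.argv[1]) to page() in place of the sample list.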
> -- 
> لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه 
> كالحقيقة.....محمد الغزالي
> "No victim has ever been more repressed and alienated than the truth"
> Emad Soliman Nawfal
> Indiana University, Bloomington

Bob Gailer
Chapel Hill NC
