[Tutor] html color coding: where to start
bob gailer
bgailer at gmail.com
Thu Sep 17 19:36:41 CEST 2009
Emad Nawfal (عماد نوفل) wrote:
> Hi Tutors,
> I want to color-code the different parts of the word in a
> morphologically complex natural language. The file I have looks like
> this, where the first column is the word, and the second is the
> composite part of speech tag. For example, Al is a DETERMINER, wlAy is
> a NOUN and At is a PLURAL NOUN SUFFIX
>
> Al+wlAy+At DET+NOUN+NSUFF_FEM_PL
> Al+mtHd+p DET+ADJ+NSUFF_FEM_SG
>
> The output I want is one on which the word has no plus signs, and each
> segment is color-coded with a grammatical category. For example, the
> noun is red, the det is green, and the suffix is orange. Like on this
> page here:
> http://docs.google.com/View?id=df7jv9p9_3582pt63cc4
> I am stuck with the html part and I don't know where to start. I have
> no experience with html, but I have this skeleton (which may not be
> the right thing any way)
> Any help with materials, modules, suggestions appreciated.
>
> This skeleton of my program is as follows:
>
> #############
> RED = ("NOUN", "ADJ")
> GREEN = ("DET", "DEMON")
> ORANGE = ("NSUFF", "VSUFF", "ADJSUFF")
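Instead of that, use a dictionary:
colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
              NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")
Then a single lookup gives the color for a tag. Note that tags such as
NSUFF_FEM_PL need the part after the first underscore dropped before the lookup.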
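Here is a rough sketch of how that lookup might go. The helper name
color_for and the "black" fallback are my own inventions for illustration,
and I am assuming the base category is always the part of the tag before
the first underscore, as in your two sample lines.

```python
# Map each base part-of-speech tag to an HTML color name.
colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
              NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")

def color_for(tag, default="black"):
    """Return the color for a tag such as 'NSUFF_FEM_PL' (hypothetical helper).

    Only the part before the first underscore is used for the lookup;
    unknown tags fall back to the default color.
    """
    base = tag.split("_")[0]
    return colors.get(base, default)

# Try it on one composite tag from the sample file.
for tag in "DET+NOUN+NSUFF_FEM_PL".split("+"):
    print(tag, color_for(tag))
```

With a dictionary there is no need for a chain of if/else tests, and adding
a new category is a one-entry change.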
> # print html head
> def print_html_head():
>     # print the head of the html page
>
> def print_html_tail():
>     # print the tail of the html page
>
> def color(segment, color):
>     # STUCK HERE: should take a color and a segment, for example
>
> # main
> import sys
> infile = open(sys.argv[1]) # takes as input the POS-tagged file
> print_html_head()
> for line in infile:
>     line = line.split()
>     if len(line) != 2: continue
>     word = line[0]
>     pos = line[1]
>     zipped = zip(word.split("+"), pos.split("+"))
>
>     for x, y in zipped:
>         if y in DET:
>             color(x, "#FF0000")
>         else:
>             color(x, "#0000FF")
>
>
> print_html_tail()
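For the part where you are stuck, one possible approach is to wrap each
segment in an HTML span with an inline color style and join the spans
without the plus signs. This is only a sketch under some assumptions: it is
written for Python 3 (html.escape does not exist in Python 2), the names
COLORS, line_to_html and page are made up for illustration, and the page
layout (one word per line, separated by <br>) is my guess at what you want.

```python
import html

# Base tag -> HTML color name (same idea as the dictionary suggested above).
COLORS = dict(NOUN="red", ADJ="red", DET="green", DEMON="green",
              NSUFF="orange", VSUFF="orange", ADJSUFF="orange")

def color(segment, color_name):
    """Return the segment wrapped in a span that sets its text color."""
    return '<span style="color:%s">%s</span>' % (color_name,
                                                 html.escape(segment))

def line_to_html(line, default="black"):
    """Turn 'Al+wlAy+At DET+NOUN+NSUFF_FEM_PL' into colored spans.

    Returns None for lines that do not have exactly two columns.
    """
    fields = line.split()
    if len(fields) != 2:
        return None
    word, pos = fields
    spans = []
    for segment, tag in zip(word.split("+"), pos.split("+")):
        base = tag.split("_")[0]          # NSUFF_FEM_PL -> NSUFF
        spans.append(color(segment, COLORS.get(base, default)))
    return "".join(spans)                 # no plus signs in the output

def page(lines):
    """Build a complete HTML page from an iterable of input lines."""
    body = [h for h in map(line_to_html, lines) if h is not None]
    return ('<html><head><meta charset="utf-8"></head><body>\n'
            + "<br>\n".join(body)
            + "\n</body></html>")

print(line_to_html("Al+wlAy+At DET+NOUN+NSUFF_FEM_PL"))
```

To produce a file you could write page(open(sys.argv[1])) to something like
out.html and open it in a browser. Returning strings instead of printing
inside color() makes the pieces easier to test on their own.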
>
>
>
>
> --
> لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه
> كالحقيقة.....محمد الغزالي
> "No victim has ever been more repressed and alienated than the truth"
>
> Emad Soliman Nawfal
> Indiana University, Bloomington
--
Bob Gailer
Chapel Hill NC
919-636-4239