[Moin-user] Parser for a simple chat

Thilo Pfennig tpfennig at gmail.com
Tue May 9 04:59:06 EDT 2006


Hi, I've tried to write a parser for a simple chat. I looked into
the IRC parser but did not quite get how it works.


The chat format is very simple:
"User Name: Message"

I think it should use a regex, because if ":" were the separator in a
CSV-style parser, the ":" in ":-)" would be treated as a separator too
and the smiley would get filtered.
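
To illustrate the concern, here is a minimal sketch of a naive CSV-style split on ":". The sample line is made up for the example; it only shows how the smiley in the message ends up cut apart.

```python
# Naive CSV-style split on ":" (illustration only, sample line is invented).
line = "User Name: see you later :-)"
fields = line.split(":")

# The smiley's ":" is treated as a separator, so the message is cut in two.
print(fields)
```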

I have found this part of the IRC parser to be the most important:


    def format(self, formatter):
        lines = self.raw.split('\n')
        # TODO: Add support for displaying things like join and part messages.
        pattern = re.compile(r"""
            ((\[|\()?                      # Opening bracket for the timestamp (if it exists)
                (?P<time>([\d]?\d[:.]?)+)  # Timestamp as one or more :/.-separated groups of 1 or 2 digits (if it exists)
            (\]|\))?\s+)?                  # Closing bracket for the timestamp (if it exists) plus whitespace
            <\s*?(?P<nick>.*?)\s*?>        # Nick
            \s+                            # Space between the nick and message
            (?P<msg>.*)                    # Message
        """, re.VERBOSE + re.UNICODE)
        self.out.write(formatter.table(1))
        for line in lines:
            match = pattern.match(line)
            if match:
                self.out.write(formatter.table_row(1))
                for g in ('time', 'nick', 'msg'):
                    self.out.write(formatter.table_cell(1))
                    self.out.write(formatter.text(match.group(g) or ''))
                    self.out.write(formatter.table_cell(0))
                self.out.write(formatter.table_row(0))
        self.out.write(formatter.table(0))


I reduced the regex to:

        pattern = re.compile(r"""
            (?P<nick>.*?):                 # Nick
            \s+                            # Space between the nick and message
            (?P<msg>.*)                    # Message
        """, re.VERBOSE + re.UNICODE)

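A quick way to check a reduced pattern like this is to run it against a sample line outside the wiki. The sketch below assumes the pattern is compiled with re.VERBOSE as in the IRC parser; the name chat_pattern and the sample lines are made up for the example.

```python
import re

# Reduced chat pattern (assumption: compiled with the same flags as the IRC parser).
chat_pattern = re.compile(r"""
    (?P<nick>.*?):   # Nick, lazily matched up to the first colon that is followed by whitespace
    \s+              # Space between the nick and message
    (?P<msg>.*)      # Message
""", re.VERBOSE | re.UNICODE)

match = chat_pattern.match("User Name: see you later :-)")
nick, msg = match.group("nick"), match.group("msg")
print(nick)
print(msg)

# Lines without "colon + whitespace" do not match at all.
no_colon = chat_pattern.match("no separator here")
no_space = chat_pattern.match("a:b")
```

Because the nick is matched lazily and the colon must be followed by whitespace, the ":" inside the smiley stays part of the message.
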
I understand that the output gets formatted like this:

1. self.out.write(formatter.table(1))
   (Make a table)

2. self.out.write(formatter.table_row(1))
   (Make a row)

3. for g in ('time', 'nick', 'msg'):
   (3 "fields")

4. self.out.write(formatter.table_cell(1))
   (Does this mean "open a cell" (<td>)?)

5. self.out.write(formatter.text(match.group(g) or ''))
   (Is this "pasting" the content of one of the 3 values inside the td?)
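
The steps above can be tried outside the wiki with a stand-in formatter. This is only a sketch: SketchHtmlFormatter, CHAT_LINE and format_chat are hypothetical names, and the stand-in assumes that the argument 1 means "open" and 0 means "close", as the paired calls in the IRC parser suggest. The real MoinMoin formatter may emit different markup. Note that the loop only uses 'nick' and 'msg': the reduced pattern has no 'time' group, so match.group('time') would raise an IndexError.

```python
import re
from html import escape


class SketchHtmlFormatter:
    """Hypothetical stand-in, NOT the real MoinMoin formatter."""

    def table(self, on):
        return "<table>" if on else "</table>"

    def table_row(self, on):
        return "<tr>" if on else "</tr>"

    def table_cell(self, on):
        return "<td>" if on else "</td>"

    def text(self, text):
        # Escape characters that are special in HTML.
        return escape(text)


# Reduced chat pattern: "nick", colon, whitespace, "msg".
CHAT_LINE = re.compile(r"(?P<nick>.*?):\s+(?P<msg>.*)")


def format_chat(raw, formatter):
    """Render 'User Name: Message' lines as a two-column table."""
    out = [formatter.table(1)]                    # open the table
    for line in raw.split("\n"):
        match = CHAT_LINE.match(line)
        if match:
            out.append(formatter.table_row(1))    # open a row
            for g in ("nick", "msg"):             # no 'time' group here
                out.append(formatter.table_cell(1))   # open a cell
                out.append(formatter.text(match.group(g) or ""))
                out.append(formatter.table_cell(0))   # close the cell
            out.append(formatter.table_row(0))    # close the row
    out.append(formatter.table(0))                # close the table
    return "".join(out)


html_out = format_chat("Alice: hi\nBob: hello :-)", SketchHtmlFormatter())
print(html_out)
```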

OK, any hints would be nice.

A nice feature would be this behaviour:
* every user gets a color (maybe one of 256 web colors?)
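
One possible way to get a stable color per user is to derive it from the nick. The sketch below is only an assumption about how this could work: PALETTE and nick_color are hypothetical names, and the small palette stands in for whatever set of web colors would actually be used.

```python
import zlib

# Hypothetical palette; any list of CSS color values would do.
PALETTE = ["#cc0000", "#007700", "#0000cc", "#aa5500", "#770077", "#007777"]


def nick_color(nick):
    """Map a nick to a palette entry; the same nick always gets the same color."""
    checksum = zlib.crc32(nick.encode("utf-8"))
    return PALETTE[checksum % len(PALETTE)]


print(nick_color("User Name"))
```

With more users than palette entries, two nicks can end up with the same color.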


Thilo

--
http://vinci.wordpress.com



