a little parsing challenge ☺

Xah Lee xahlee at gmail.com
Sun Jul 17 09:47:42 CEST 2011


folks, this will be an interesting one.

the problem is to write a script that can check a dir of text files
(and all subdirs) and report if a file has any mismatched matching
brackets.

• The files will be utf-8 encoded (unix style line endings).

• If a file has mismatched matching-pairs, the script will display the
file name, and the line number and column number of the first
instance where a mismatched bracket occurs. (or, just the char number
instead (as in emacs's “point”))

• the matching pairs are all single unicode chars. They are these and
nothing else: () {} [] “” ‹› «» 【】 〈〉 《》 「」 『』
Note that ‘single curly quote’ is not considered a matching pair here.

• Your script must be standalone. It must not use any parser tools,
but it can call libs that are part of the standard distribution of your lang.
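
To make the input and output requirements above concrete, here is a rough Python sketch of the two supporting pieces: walking a dir (and subdirs) reading each file as utf-8, and turning a char offset into a line number and column number. The names iter_text_files and line_and_column are made up for illustration, and 1-based line and column numbers are an assumption, since the spec does not say where counting starts.

```python
import os

def iter_text_files(root):
    """Yield (path, text) for every file under root, decoded as UTF-8."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # newline="" keeps line endings exactly as stored in the file
            with open(path, encoding="utf-8", newline="") as f:
                yield path, f.read()

def line_and_column(text, offset):
    """Convert a 0-based char offset into a (line, column) pair.

    Assumption: both line and column are 1-based.
    """
    line = text.count("\n", 0, offset) + 1
    line_start = text.rfind("\n", 0, offset) + 1  # 0 when on the first line
    column = offset - line_start + 1
    return line, column
```

Only os from the standard library is used, so this stays within the "standalone" rule.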

Here are some examples of mismatched brackets: ([)], (“[[”), ((, 】etc. (and
yes, the brackets may be nested. There is usually text between these
brackets.)
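
One common way to do this kind of check is with a stack: push each opening bracket, and on each closing bracket check that it matches the most recent unclosed opener. Below is a minimal Python sketch of that idea, not a reference solution. The name first_mismatch is made up for illustration. Reporting the earliest unclosed opener when the text ends with brackets still open is an assumption, since the spec does not say which position to report in that case.

```python
# The matching pairs listed in the problem statement, opener -> closer.
PAIRS = {
    "(": ")", "{": "}", "[": "]", "“": "”", "‹": "›", "«": "»",
    "【": "】", "〈": "〉", "《": "》", "「": "」", "『": "』",
}
CLOSERS = set(PAIRS.values())

def first_mismatch(text):
    """Return the 0-based char offset of the first mismatch, or None."""
    stack = []  # entries are (expected closer, offset of the opener)
    for offset, ch in enumerate(text):
        if ch in PAIRS:
            stack.append((PAIRS[ch], offset))
        elif ch in CLOSERS:
            # A closer with no opener, or one that does not match the
            # most recent unclosed opener, is a mismatch.
            if not stack or stack[-1][0] != ch:
                return offset
            stack.pop()
    if stack:
        # Assumption: for brackets never closed, report the earliest one.
        return stack[0][1]
    return None
```

All other chars, including ‘single curly quotes’, are just skipped as ordinary text.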

I'll be writing an emacs lisp solution and will post it in 2 days. I welcome
other lang implementations. In particular, perl, python, php, ruby,
tcl, lua, Haskell, Ocaml. I'll also be able to eval common lisp
(clisp) and Scheme lisp (scsh), Java. Other langs such as Clojure,
Scala, C, C++, or any others, are all welcome, but I won't be able to
eval them. A javascript implementation will be very interesting too, but
please indicate which command line version to use and where to get it.

I hope you'll find this an interesting “challenge”. This is a parsing
problem. I haven't studied parsers except some Wikipedia reading, so
my solution will probably be naive. I hope to see and learn from your
solution too.

I hope you'll participate. Just post your solution here. Thanks.


More information about the Python-list mailing list