a little parsing challenge ☺

Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9af6 at myhashismyemail.com
Tue Jul 19 13:33:40 EDT 2011


On 07/19/2011 01:14 PM, Xah Lee wrote:
> I added other unicode brackets to your list of brackets, but it seems
> your code still fail to catch a file that has mismatched curly quotes.
> (e.g.http://xahlee.org/p/time_machine/tm-ch04.html  )
>
> LOL Billy.
>
>   Xah

I suspect it's because the file is being opened in 'rb' mode.  Also, in 
the dictionary of characters at the top, the closing token is the key, 
while the opening one is the value.  Not sure if that's obvious.
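
To illustrate why I suspect the 'rb' mode, here is a minimal Python 3
sketch. It is not the code from earlier in the thread: PAIRS and
count_bracket_chars are hypothetical names, and decoding as latin-1 is
only a stand-in for looking at the file one byte at a time.

```python
# Hypothetical bracket table: closing token is the key, opening token is the value.
PAIRS = {")": "(", "]": "[", "}": "{", "\u201d": "\u201c"}
OPENERS = set(PAIRS.values())


def count_bracket_chars(text):
    """Count the characters in text that appear anywhere in the bracket table."""
    return sum(1 for ch in text if ch in PAIRS or ch in OPENERS)


sample = "\u201cquoted\u201d (text)"
raw = sample.encode("utf-8")  # the bytes a file opened with 'rb' would hand back

# Decoded as UTF-8, both curly quotes and both parentheses are recognised.
seen_in_text = count_bracket_chars(raw.decode("utf-8"))

# Viewed one byte per character (simulated with latin-1), each curly quote
# falls apart into three unrelated bytes, so only the ASCII brackets match.
seen_in_bytes = count_bracket_chars(raw.decode("latin-1"))

print(seen_in_text, seen_in_bytes)
```

If that is what is happening, opening the file in text mode with an
explicit encoding, or decoding the bytes before scanning, should let the
curly quotes match their dictionary entries.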

Also, returning the position of the first mismatched pair is somewhat 
ambiguous.  File systems store files as streams of octets (mine do 
anyway) rather than as characters.  When you ask for the position of 
the first mismatched pair, do you mean the byte offset as reported by 
file.tell(), or do you mean the nth character in the UTF-8 stream?
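
A small Python 3 sketch of how far apart the two answers can be. The
helper name char_and_byte_offset is made up for this example, and it
assumes the file is UTF-8.

```python
def char_and_byte_offset(text, target):
    """Return (character index, UTF-8 byte offset) of the first target, or None."""
    idx = text.find(target)
    if idx < 0:
        return None
    # The byte offset is the encoded length of everything before the target.
    byte_offset = len(text[:idx].encode("utf-8"))
    return idx, byte_offset


# Two curly quotes (three bytes each in UTF-8) precede the stray ")".
sample = "\u201cab\u201d)"
print(char_and_byte_offset(sample, ")"))  # (4, 8)
```

The stray ")" is the fifth character (index 4) but sits at byte offset 8,
which is the kind of number a binary-mode file position would report.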

Also, you may have answered this earlier, but I'll ask again anyway: when 
you ask for the first mismatched pair, are you referring to the innermost 
mismatch or the outermost?  For example, suppose you have this file:

foo[(])bar

Would the "(" be the first mismatched character or would the "]"?
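
To make the two candidates concrete, here is a sketch of a plain
stack-based scan in Python 3 that reports both of them and leaves the
choice to the caller. The names PAIRS and first_conflict are
hypothetical, and this is only one possible reading of the challenge.

```python
# Hypothetical bracket table: closing token is the key, opening token is the value.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def first_conflict(text):
    """Return (opener, closer) for the first conflict, each as (index, char).

    opener is None if a closer appears with nothing open; closer is None if
    the text ends with brackets still open. Returns None if balanced.
    """
    stack = []
    for i, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((i, ch))
        elif ch in PAIRS:
            top = stack.pop() if stack else None
            if top is None or top[1] != PAIRS[ch]:
                return top, (i, ch)
    # Anything left on the stack was never closed; report the innermost one.
    return (stack[-1], None) if stack else None


print(first_conflict("foo[(])bar"))  # ((4, '('), (5, ']'))
```

On this input the scan stops at the "]" (index 5) because the innermost
open bracket is the "(" (index 4), so either character could reasonably
be called the first mismatch.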

--
Bill


