[Tutor] Parse Problem [2] (fwd)

Mon, 12 Aug 2002 11:05:47 -0700 (PDT)

Hi fleet,

I'm forwarding your message to the Tutor mailing list, because it gives
other people the opportunity to offer more suggestions.

I'll play around with 'csv' a little bit and see if it can use multiple
delimiters --- perhaps there's a "whitespace" setting that will work for
all kinds of whitespace, so that we won't have to introduce tabs into the
program input.
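
As a possible starting point for the "any whitespace" idea, here is a minimal sketch that swaps in the standard library's shlex module instead of 'csv'. The helper name split_fields is made up for illustration, and the sample line is modeled on the procmail log line quoted further down.

```python
import shlex

def split_fields(line):
    """Split on any run of whitespace, keeping double-quoted text together.

    split_fields is a hypothetical helper name, not part of any library.
    """
    return shlex.split(line)

# Spaces and a tab are mixed on purpose to show both act as separators.
sample = 'procmail:   Matched\t"Flicks Softwares"'
fields = split_fields(sample)
print(fields[2])
```

One caveat: shlex follows shell quoting rules, so a stray apostrophe or backslash in a log line would need extra handling.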

---------- Forwarded message ----------
Date: Mon, 12 Aug 2002 13:34:41 -0400 (EDT)
From: fleet@teachout.org
To: Danny Yoo <dyoo@hkn.eecs.berkeley.edu>
Subject: Re: [Tutor] Parse Problem [2]

Hmmm.  The file I'm parsing is output from another program (procmail log)
- so I don't have the option of formatting the output to include tabs or
commas or whatever.  It just struck me that I may be better off just
slicing the string and removing the pieces I don't want.

My bash "shell" command to provide this function is:
grep 'Matched "' log | cut -d" " -f3- | sed 's/\"//g'
and I think slice would accomplish the task in roughly the same way.

Thanks for the response!

				- fleet -

PS: In case you don't recall the original post, the string in the log
looks like:
procmail: Matched "Flicks Softwares"
and similar.
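
The slicing idea could look roughly like the sketch below. It assumes every line of interest contains one pair of double quotes, as in the sample above; the function name quoted_text is made up for illustration.

```python
def quoted_text(line):
    """Return the text between the first and last double quote, or None."""
    start = line.find('"')    # index of the opening quote, -1 if absent
    end = line.rfind('"')     # index of the closing quote
    if start == -1 or end <= start:
        return None           # no quotes, or only a single quote character
    return line[start + 1:end]

sample = 'procmail: Matched "Flicks Softwares"'
print(quoted_text(sample))
```

Because the slice runs from the first quote to the last one, spaces inside the quotes are kept as they are.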

On Mon, 12 Aug 2002, Danny Yoo wrote:

>
>
> On Mon, 12 Aug 2002 fleet@teachout.org wrote:
>
> > Ok, I'll try to be less verbose.
> >
> > Given the strings 'xxxxx: xxxxx "xxx xxx"' and 'xxxxx: xxxxx "xxxxxxx"'
> > If I use xx=string.split(string," ") I get xx[2] as "xxx in the first
> > instance and "xxxxxxx" in the second.  I need to get "xxx xxx".  Ie, I
> > need to get everything within the quotes as xx[2].
>
> Hi fleet,
>
> Ah, so a simple string split doesn't work because it doesn't know what it
> means to "quote" something.
>
> You may want to use the CSV module for your program:
>
>     http://www.object-craft.com.au/projects/csv/
>
> It's supposed to handle quotes, and you can set the splitting character to
> tab by doing something like this:
>
> ###
> p = csv.parser(field_sep='\t')
> ###
>
>
> Good luck!
>
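
For comparison, here is a minimal sketch of the same quote-aware split using the csv module from Python's standard library, which is an assumption here and not the Object Craft module linked above. It uses a single space as the delimiter, so no tabs are needed in the input; split_quoted is a made-up helper name.

```python
import csv

def split_quoted(line):
    """Split one line on single spaces, treating "..." as one field."""
    # csv.reader accepts any iterable of lines, so wrap the line in a list.
    reader = csv.reader([line], delimiter=' ', quotechar='"')
    return next(reader)

xx = split_quoted('xxxxx: xxxxx "xxx xxx"')
print(xx[2])
```

With a single-space delimiter, runs of several spaces would produce empty fields, so this sketch fits best when the log uses exactly one space between fields.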