clear the files using python
Peter Hansen
peter at engcorp.com
Mon May 9 08:48:27 EDT 2005
Sez sez:
> Each file's structure as below:
> Comments: This is article 1965 obtained from the website
> Title: Banana Report #65, September 2003
> Author: dylab
> Date: 1st September 2003
> Section: pulse
>
> In the past month:
> A mass hit North America, cutting electricity to 50 million people
> across the North east
>
>
> I'm expected to execute the Python script so that the file looks like
> this:
>
> pulse, In, the, past, month, A, mass, hit, North, America, cutting,
> electricity, to, 50, million, people, across, the, North east, dylab
You'll need either more examples or a more detailed description. The
above could be interpreted as something like "put the pulse section
first, then exactly 19 words from the following text, removing
punctuation and line breaks, and taking the last two words together as
one, then add the 'author' field, and write them all out together with a
field separator of ', ' (comma plus space)".
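
For illustration only, here is a rough sketch of that one literal reading. Everything in it is an assumption rather than something the original poster confirmed: the helper names parse_article and flatten are made up, the header is taken to be "Key: value" lines ending at the first blank line, punctuation is dropped by keeping only runs of word characters, the body is not cut off at 19 words, and "North east" is not joined into a single field.

```python
import re

def parse_article(text):
    """Split an article into a dict of header fields and the body text.

    Assumes "Key: value" header lines followed by one blank line.
    """
    header, _, body = text.partition("\n\n")
    fields = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines with no colon
            fields[key.strip().lower()] = value.strip()
    return fields, body

def flatten(text):
    """Return 'section, word, word, ..., author' as a single string."""
    fields, body = parse_article(text)
    words = re.findall(r"\w+", body)  # drops punctuation and line breaks
    parts = [fields.get("section", "")] + words + [fields.get("author", "")]
    return ", ".join(parts)

sample = """Comments: This is article 1965 obtained from the website
Title: Banana Report #65, September 2003
Author: dylab
Date: 1st September 2003
Section: pulse

In the past month:
A mass hit North America, cutting electricity to 50 million people
across the North east
"""

print(flatten(sample))
```

Note that this prints "North, east" as two fields where the requested output shows "North east" as one, which is exactly the sort of detail that needs a clearer specification.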
On the other hand, it could be interpreted in a large number of other
ways, and since none of us has any idea what you are trying to do with
the results, we can't use our own intuition or experience to help.
I also personally find it hard to respond to questions like this with
real code when there are things about the task which I find very
surprising. For example, you're throwing away the date information
entirely, along with the comments and title. Is that really intended?
And are the author and section fields always exactly one word, with no
punctuation? (What would happen if an author's name was "Hansen,
Peter"? How would you format that in the output without getting the
first name confused with the next field?)
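
One common way around that ambiguity, offered only as a sketch, is to let the standard csv module quote any field that contains the separator. The helper name to_csv_line is hypothetical, and note that csv separates fields with a bare comma rather than the comma plus space shown in the requested output, so this may or may not fit what is actually wanted.

```python
import csv
import io

def to_csv_line(fields):
    """Format a list of strings as one CSV line, quoting where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue().rstrip("\n")

fields = ["pulse", "In", "the", "past", "month", "Hansen, Peter"]
line = to_csv_line(fields)
print(line)  # the author field comes out wrapped in double quotes

# Reading the line back recovers the original fields, comma included.
recovered = next(csv.reader([line]))
print(recovered == fields)
```

The point of the design is that the reader of the file never has to guess where one field ends and the next begins.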
> Could you please point me to right direction here. Or provide some
> example code. In the mean time I'll be searching myself. I know you
> guys hate novice people like me but I would appreciate it if you could
> provide a little help here.
We don't "hate" novice people by any means... I suspect you are either
trying to be self-deprecating or maybe you just haven't read this
newsgroup for long. c.l.p actually *loves* novices; it just doesn't
care for questions that aren't very clear. Keep trying (and improving!)
and you'll definitely get the help you need.
And your comment about Python being the best language for this is pretty
close to the mark... but there are certainly a variety of ways to go
about the task and the best might depend on a lot of unanswered questions.
-Peter