cs at zip.com.au
Sun Dec 2 23:11:51 CET 2012
On 02Dec2012 07:02, subhabangalore at gmail.com <subhabangalore at gmail.com> wrote:
| On Sunday, December 2, 2012 5:39:32 PM UTC+5:30, subhaba... at gmail.com wrote:
| > I am using NLTK and I used the following command,
| > chunk=nltk.ne_chunk(tag)
| > print "The Chunk of the Line Is:",chunk
| > The Chunk of the Line Is: (S
| > ''/''
| > It/PRP
| > Now I am trying to split the output preferably by ",/,".
| Sorry to ask this. I converted it to a string and then split it.
I'm glad you solved your problem, but I would like to point out that
this is generally a risky way of manipulating data.
The problem arises if the string you're splitting on occurs as a literal
piece of text, but _not_ in the sense you intend. It may be the case
that it will not happen in your particular situation, but in general the
two-step approach of
- convert structure to string somehow
- perform simple text manipulation
is at risk of parsing the string too simplistically.
A common example is with CSV data. Suppose you wanted the third
column from a list of tuples:
rows = [ (1,2,"A",4),
and you wanted [ "A", "B", "C,D" ]. If one went with the "convert to
text" approach, and decided that converting each tuple to a CSV style
data row was a good idea, you might write:
column_3 = []
for row in rows:
    csv_string = ",".join( str(item) for item in row )
    item3 = csv_string.split(",")[2]
    column_3.append(item3)
The (simplistic) code above will give you "C" from the third row, not
"C,D", because it naively assumes there are no commas in the data and
then does a simplistic textual split to find the third column.
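For anyone who wants to run the comparison, here is a minimal sketch. Only the
first tuple in rows is taken from the example above; the other two rows are
made-up values, chosen only so that the third column holds "A", "B" and "C,D".
It contrasts the naive join/split with indexing the tuple directly, and with
the standard csv module for the case where a text form really is needed.

```python
import csv
import io

# Only the first tuple comes from the example above; the other two rows are
# hypothetical values picked so the third column is "A", "B", "C,D".
rows = [
    (1, 2, "A", 4),
    (5, 6, "B", 8),
    (9, 10, "C,D", 12),
]

# Naive approach: join with commas, then split on commas again.
naive = []
for row in rows:
    csv_string = ",".join(str(item) for item in row)
    naive.append(csv_string.split(",")[2])

# Structural approach: index the tuple directly, no text parsing involved.
direct = [row[2] for row in rows]

# If a text form is really wanted, the csv module quotes embedded commas
# when writing and undoes that quoting when reading.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
buffer.seek(0)
round_trip = [fields[2] for fields in csv.reader(buffer)]

print(naive)
print(direct)
print(round_trip)
```

Note that the csv round trip hands every field back as a string, so it
preserves the column boundaries but not the original types.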
Obviously you wouldn't really do that for something this simple; it is to
show the issue. But your situation where manipulating a tree was tricky
and you converted it to a string is very similar conceptually.
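To make the parallel concrete, here is a small sketch of the same trap with
tagged words. It deliberately does not use NLTK: the tagged list below is a
hypothetical stand-in made of plain (word, tag) tuples, and split_on_tag is a
helper name invented for this illustration. One word is contrived to contain
the text ",/," so that the textual split misfires.

```python
# Hypothetical stand-in for a tagged sentence: plain (word, tag) tuples,
# not NLTK's own data type. The word ",/," is contrived on purpose.
tagged = [
    ("It", "PRP"), ("rained", "VBD"), (",", ","),
    ("so", "RB"), ("we", "PRP"), ("typed", "VBD"), (",/,", "NN"), (",", ","),
    ("then", "RB"), ("left", "VBD"),
]

def split_on_tag(pairs, split_tag=","):
    """Group (word, tag) pairs into runs separated by items tagged split_tag."""
    groups, current = [], []
    for word, tag in pairs:
        if tag == split_tag:
            groups.append(current)
            current = []
        else:
            current.append((word, tag))
    groups.append(current)
    return groups

# Structural split: decided by the tag field, never by the word's text.
structural = split_on_tag(tagged)

# Textual split: render as "word/tag" text, then split on the literal ",/,".
as_text = " ".join("%s/%s" % pair for pair in tagged)
textual = as_text.split(",/,")

print(len(structural))
print(len(textual))
```

The structural version yields three groups, while the textual version yields
four pieces, because the contrived word matches the separator text as well.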
Hoping this shows you the issue,
Cameron Simpson <cs at zip.com.au>
I'm not making any of this up you know. - Anna Russell