[Tutor] how avoid writing a newline?

Tommy Kaas tommy.kaas at kaasogmulvad.dk
Wed Jan 12 16:39:46 CET 2011


I'm using ActivePython 2.6.6 on PC/Win7.

 

I have made a small scraper script as an exercise for myself. 

It scrapes the name and some details of the first 25 billionaires on the
Forbes list.

It works and writes the result to a text file, with the columns separated by
"#".

It takes the name from the link (t = i.string), opens the link and scrapes
details from the next page.

But I can't find a way to write the name (the variable t) once, and only
once, at the beginning of the line.

As t is written now, it appears at the beginning of the line, but it is
followed by a newline.

Can I avoid that in a simple way?

 

Thanks in advance for any help

Tommy

 

 

from BeautifulSoup import BeautifulSoup
from mechanize import Browser

f = open("forbes.txt", "w")
br = Browser()
url = "http://www.forbes.com/lists/2010/10/billionaires-2010_The-Worlds-Billionaires_Rank.html"
page = br.open(url)
html = page.read()
soup = BeautifulSoup(html)
table = soup.find("table")
l = table.findAll('a')

for i in l[5:]:
    t = i.string  # the name, taken from the link text
    print t  # to the monitor

    br.follow_link(text_regex=r"(.*?)"+t+"(.*?)")
    tekst = br.response().read()
    soup = BeautifulSoup(tekst)
    table1 = soup.find('table', id='billTable')
    rows = table1.findAll('tr')
    print >> f, t,"#"  # writes the name and "#", followed by a newline
    for tr in rows:
        tds = tr.findAll(text=True)
        # the trailing comma keeps the details of all rows on one line
        print >> f, tds[1].string,"#",tds[2].string,"#",
    print >> f, '\r\n'  # ends the record

f.close()
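
A possible way around the newline, as a sketch only: in Python 2 a print
statement that ends with a comma (print >> f, t, "#",) does not write the
newline, which is what the inner loop above already relies on. An
alternative that does not depend on print at all is to collect the fields
of one record in a list and write them with a single f.write() call. The
helper name format_record and the sample values below are made up for
illustration; they are not part of the script above.

```python
import io


def format_record(name, details, sep=u"#"):
    """Return one output line: the name first, then each detail, joined by sep."""
    fields = [name] + list(details)
    # Exactly one newline is added, at the very end of the record.
    return sep.join(fields) + u"\n"


# Placeholder values only, not real scraped data.
out = io.StringIO()
out.write(format_record(u"Some Name", [u"label 1", u"value 1", u"label 2", u"value 2"]))
```

In the scraper, this would presumably mean appending tds[1].string and
tds[2].string to a list inside the "for tr in rows" loop and calling
f.write(format_record(t, details)) once after the loop, in place of the
three print statements.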

 

 

 
