[Tutor] parse text file
Norman Khine
norman at khine.net
Mon Feb 1 16:30:02 CET 2010
On Mon, Feb 1, 2010 at 1:19 PM, Kent Johnson <kent37 at tds.net> wrote:
> On Mon, Feb 1, 2010 at 6:29 AM, Norman Khine <norman at khine.net> wrote:
>
>> thanks, what about the whitespace problem?
>
> \s* will match any amount of whitespace, including newlines.
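A quick way to check that behaviour is a small sketch on a made-up string. The sample text below is an assumption for illustration, not taken from the real file:

```python
import re

# Hypothetical sample: the comma is followed by a newline and some
# indentation before the second number.
sample = "GLatLng(44.5,\n      3.25)"

# \s* matches zero or more whitespace characters, and a newline counts
# as whitespace, so the pattern spans the line break.
pattern = re.compile(r"GLatLng\((-?\d+\.\d*),\s*(-?\d+\.\d*)\)")

match = pattern.search(sample)
coords = match.groups()
print(coords)
```

Because \s* also matches zero characters, the same pattern should still match when the two numbers sit on one line with no space between them.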
thank you, this worked well.
here is the code:
###
import re
file=open('producers_google_map_code.txt', 'r')
data = repr( file.read().decode('utf-8') )
block = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
b = block.findall(data)
block_list = []
for html in b:
    namespace = {}
    t = re.compile(r"""<strong>(.*)<\/strong>""")
    title = t.findall(html)
    for item in title:
        namespace['title'] = item
    u = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
    url = u.findall(html)
    for item in url:
        namespace['url'] = item
    g = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")
    lat = g.findall(html)
    for item in lat:
        namespace['LatLng'] = item
    block_list.append(namespace)
###
can this be made better?
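
One possible tidy-up, offered only as a sketch: compile the patterns once outside the loop, and match against the raw text instead of its repr(), using re.DOTALL so that "." can cross line breaks. It differs from the code above in two ways. It uses search(), which keeps the first match where the loops above keep the last, and it uses non-greedy groups. So it assumes each block holds at most one title, one link and one coordinate pair. The name parse_blocks and the sample text are made up for illustration; the real file may look different.

```python
import re

# Compile each pattern once, outside the loop.
BLOCK_RE = re.compile(r"openInfoWindowHtml\(.*?icon:\s*myIcon", re.DOTALL)
TITLE_RE = re.compile(r"<strong>(.*?)</strong>")
URL_RE = re.compile(r'a href="/(.*?)">En savoir plus')
LATLNG_RE = re.compile(r"GLatLng\((-?\d+\.\d*),\s*(-?\d+\.\d*)\)")


def parse_blocks(text):
    """Return one dict per marker block found in text (hypothetical helper)."""
    block_list = []
    for html in BLOCK_RE.findall(text):
        namespace = {}
        # search() returns the first match or None, so a missing field
        # is simply left out of the dict.
        for key, pattern in (("title", TITLE_RE), ("url", URL_RE)):
            match = pattern.search(html)
            if match:
                namespace[key] = match.group(1)
        match = LATLNG_RE.search(html)
        if match:
            namespace["LatLng"] = match.groups()
        block_list.append(namespace)
    return block_list


# Made-up sample in the rough shape the patterns expect.
sample = """\
marker.openInfoWindowHtml('<strong>Example Farm</strong> \
<a href="/producers/example">En savoir plus</a>');
var point = new GLatLng(44.5,
    3.25);
var marker = new GMarker(point, {
\ticon: myIcon
});
"""

result = parse_blocks(sample)
print(result)
```

To run it on the real file, the text could be read with io.open('producers_google_map_code.txt', encoding='utf-8') inside a with block, so the file is closed afterwards and the contents are already decoded.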
>
> Kent
>