[Tutor] parse text file

Mon Feb 1 16:30:02 CET 2010

On Mon, Feb 1, 2010 at 1:19 PM, Kent Johnson <kent37 at tds.net> wrote:
> On Mon, Feb 1, 2010 at 6:29 AM, Norman Khine <norman at khine.net> wrote:
>
>> thanks, what about the whitespace problem?
>
> \s* will match any amount of whitespace including newlines.
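
A small sketch of that point, using a made-up sample string (not taken from the real file) in which the two coordinates are split across lines:

```python
import re

# Hypothetical sample: a comma, a newline and some indentation between the numbers.
sample = "GLatLng(46.2,\n      6.1)"

# \s* consumes the newline and the spaces, so no explicit \n is needed.
pattern = re.compile(r"GLatLng\((-?\d+\.\d*),\s*(-?\d+\.\d*)\)")
match = pattern.search(sample)
coords = match.groups()
print(coords)
```

The same pattern also matches when the coordinates sit on one line, since \s* is happy with zero characters.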

thank you, this worked well.

here is the code:

###
import re

# Read the whole file and work on its repr(), so newlines and tabs
# appear as the literal two-character sequences \n and \t (Python 2).
f = open('producers_google_map_code.txt', 'r')
data = repr(f.read().decode('utf-8'))
f.close()

# Compile each pattern once, outside the loop.
block = re.compile(r"""openInfoWindowHtml\(.*?\\ticon: myIcon\\n""")
t = re.compile(r"""<strong>(.*)<\/strong>""")
u = re.compile(r"""a href=\"\/(.*)\">En savoir plus""")
g = re.compile(r"""GLatLng\((\-?\d+\.\d*)\,\\n\s*(\-?\d+\.\d*)\)""")

block_list = []
for html in block.findall(data):
	# One dict per block; if a pattern matches more than once, the last match wins.
	namespace = {}
	for item in t.findall(html):
		namespace['title'] = item
	for item in u.findall(html):
		namespace['url'] = item
	for item in g.findall(html):
		namespace['LatLng'] = item
	block_list.append(namespace)

###

can this be made better?
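
One possible direction, as a rough sketch only: compile the patterns once, use search() instead of looping over findall(), and use non-greedy groups. It assumes the patterns run on the raw text rather than on its repr(), so \s* takes the place of the literal \\n. The function name parse_block and the sample block are invented for illustration. Note that search() keeps the first match, whereas the loops above keep the last one.

```python
import re

# Patterns compiled once. Non-greedy (.*?) stops at the first closing
# delimiter instead of the last one on the line.
TITLE_RE = re.compile(r"<strong>(.*?)</strong>")
URL_RE = re.compile(r'a href="/(.*?)">En savoir plus')
LATLNG_RE = re.compile(r"GLatLng\((-?\d+\.\d*),\s*(-?\d+\.\d*)\)")

def parse_block(html):
    """Return a dict with whichever of title, url and LatLng are found in html."""
    namespace = {}
    for key, pattern in (("title", TITLE_RE), ("url", URL_RE)):
        match = pattern.search(html)
        if match:
            namespace[key] = match.group(1)
    match = LATLNG_RE.search(html)
    if match:
        namespace["LatLng"] = match.groups()
    return namespace

# Hypothetical block, made up for illustration only.
sample = (
    'new GLatLng(46.2,\n   6.1)\n'
    'openInfoWindowHtml("<strong>Some producer</strong>'
    '<a href="/producers/some-producer">En savoir plus</a>")'
)
result = parse_block(sample)
print(result)
```

A block with no title or url simply yields a dict without that key, which matches what the original loops do when findall() returns an empty list.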

>
> Kent
>