[Tutor] better way to write this code

Norman Khine norman at khine.net
Thu Aug 23 20:33:19 CEST 2012


Hello,
I have this code (http://pastie.org/4575790) which pulls data from a list
and then cleans up some of the values, such as the 'yield' entry, which has
entries like:

21
15
 ≤ 1000
 ≤ 20
2.2 - 30

For these, only the last number is kept, so ' ≤ 1000' becomes '1000' and
'2.2 - 30' becomes '30'.
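
To make the intended cleanup concrete, here is a minimal sketch of just the
'yield' handling in isolation. The helper name clean_yield and its default
argument are illustrative only (they are not in the script); the logic is
meant to mirror what the script does, including the '10' fallback for an
empty '&nbsp;' cell. In the raw data the ≤ sign appears as the HTML
entity '&le;'.

```python
def clean_yield(raw, default='10'):
    """Keep only the last whitespace-separated token of a yield cell.

    Hypothetical helper: mirrors the script's rules, nothing more.
    """
    tokens = raw.split()
    if len(tokens) >= 2:
        # ' &le; 1000' -> '1000', '2.2 - 30' -> '30'
        return tokens[-1]
    if raw == '&nbsp;':
        # empty cell: the script falls back to '10'
        return default
    return raw


samples = ['21', '15', ' &le; 1000', ' &le; 20', '2.2 - 30', '&nbsp;']
cleaned = [clean_yield(s) for s in samples]
print(cleaned)
```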

# -*- coding: UTF-8 -*-
# Norman Khine <norman at zmgc.net>
import operator, json
from BeautifulSoup import BeautifulSoup

combos = {0: 'id',
          2: 'country',
          3: 'type',
          5: 'lat',
          6: 'lon',
          12: 'name'}

TABLE_CONTENT = [['958','<a id="958F" href="javascript:c_row(\'958\')"
title="go to map"><img src="/images/c_map.png"
border="0"></a>','USA','Atmospheric','<a href="javascript:c_ol(\'958\')"
title="click date time to show origin_list (evid=958)">1945/07/16
11:29:45</a>','33.6753','-106.4747','','-.03','21','','','TRINITY','&nbsp;','&nbsp;','<a
href="javascript:c_md(\'958\')" title="click here to show source
data">SourceData</a>','&nbsp;'],['959','<a id="959F"
href="javascript:c_row(\'959\')" title="go to map"><img
src="/images/c_map.png" border="0"></a>','USA','Atmospheric','<a
href="javascript:c_ol(\'959\')" title="click date time to show origin_list
(evid=959)">1945/08/05
23:16:02</a>','34.395','132.4538','','-.58','15','','','LITTLEBOY','&nbsp;','&nbsp;','<a
href="javascript:c_md(\'959\')" title="click here to show source
data">SourceData</a>','&nbsp;'],['1906','<a id="1906F"
href="javascript:c_row(\'1906\')" title="go to map"><img
src="/images/c_map.png" border="0"></a>','GBR','Atmospheric','<a
href="javascript:c_ol(\'1906\')" title="click date time to show origin_list
(evid=1906)">1958/08/22 17:24:00</a>','1.67','-157.25','','&nbsp;',' &le;
1000','','','Pennant 2','&nbsp;','&nbsp;','<a
href="javascript:c_md(\'1906\')" title="click here to show source
data">SourceData</a>','&nbsp;'],['28','<a id="28F"
href="javascript:c_row(\'28\')" title="go to map"><img
src="/images/c_map.png" border="0"></a>','USA','Underground','<a
href="javascript:c_ol(\'28\')" title="click date time to show origin_list
(evid=28)">1961/09/16 19:45:00</a>','37.048','-116.034','0','.098',' &le;
20','','','SHREW','&nbsp;','&nbsp;','<a href="javascript:c_md(\'28\')"
title="click here to show source data">SourceData</a>','<a
href="javascript:c_es(\'NEDBMetadataYucca2.htm\');">US Yucca
Flat</a>'],['5393637','<a id="5393637F"
href="javascript:c_row(\'5393637\')" title="go to map"><img
src="/images/c_map.png" border="0"></a>','PRK','Underground','<a
href="javascript:c_ol(\'5393637\')" title="click date time to show
origin_list (evid=5393637)">2009/05/25
00:54:45</a>','41.2925','129.0657','','0','2.2 - 30','4.7','<a
href="javascript:c_stalist(\'5393637\')" title="click here to show stations
with waveform">45</a>','2009 North Korean Nuclear Test','<a
href="javascript:c_bull(\'5393637\')" title="click here to show
bulletin">Bulletin</a>','<a href="javascript:c_tres(\'5393637\')"
title="click here to show IASP91 time residuals with respect to preferred
solution">TimeRes</a>','<a href="javascript:c_md(\'5393637\')" title="click
here to show source data">SourceData</a>','<a
href="javascript:c_es(\'NEDBMetadataNKorea2009.htm\');">NK2009</a>']]

event_list = []
for event in TABLE_CONTENT:
    event_dict = {}
    for index, item in enumerate(event):
        if index == 8:
            if item == '&nbsp;':
                event_dict['depth'] = '0'
            else:
                event_dict['depth'] = item
        if index == 9:
            try:
                items = item.split()
                if len(items) >= 2:
                    event_dict['yield'] = items[-1]
                else:
                    if item == '&nbsp;':
                        event_dict['yield'] = '10'
                    else:
                        event_dict['yield'] = item
            except:
                pass
        if index == 4:
            soup = BeautifulSoup(item)
            for a in soup.findAll('a'):
                event_dict['date'] = ''.join(a.findAll(text=True))
        if index == 3:
            if 'Atmospheric' in item:
                event_dict['fill'] = 'red'
            if 'Underground' in item:
                event_dict['fill'] = 'green'
        elif index in combos:
            event_dict[combos[index]] = item
    event_list.append(event_dict)
    print event_dict
event_list = sorted(event_list, key=operator.itemgetter('id'))

f = open('detonations.json', 'w')
f.write(json.dumps(event_list))
f.close()
print 'detonations.json, written!'

This then produces a .json file such as:

[{"name": "Pennant 2", "country": "GBR", "lon": "-157.25", "yield": "1000",
"lat": "1.67", "depth": "0", "date": "1958/08/22 17:24:00", "id": "1906",
"fill": "red"}, {"name": "SHREW", "country": "USA", "lon": "-116.034",
"yield": "20", "lat": "37.048", "depth": ".098", "date": "1961/09/16
19:45:00", "id": "28", "fill": "green"}, {"name": "2009 North Korean
Nuclear Test", "country": "PRK", "lon": "129.0657", "yield": "30", "lat":
"41.2925", "depth": "0", "date": "2009/05/25 00:54:45", "id": "5393637",
"fill": "green"}, {"name": "TRINITY", "country": "USA", "lon": "-106.4747",
"yield": "21", "lat": "33.6753", "depth": "-.03", "date": "1945/07/16
11:29:45", "id": "958", "fill": "red"}, {"name": "LITTLEBOY", "country":
"USA", "lon": "132.4538", "yield": "15", "lat": "34.395", "depth": "-.58",
"date": "1945/08/05 23:16:02", "id": "959", "fill": "red"}]

Can the code be improved further?
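
One possible direction, offered only as a rough sketch: index each row
directly instead of looping over every cell and testing the index. The names
parse_event, COMBOS, FILLS and TAG are illustrative, the regular expression
is a self-contained stand-in for the BeautifulSoup call (it is not
what the script uses), and the sample row is an abbreviated copy of the
TRINITY row. Index 3 is left out of COMBOS because the script never ends up
storing 'type'.

```python
import re

# Column index -> output key (assumed subset, matching the .json output)
COMBOS = {0: 'id', 2: 'country', 5: 'lat', 6: 'lon', 12: 'name'}
FILLS = {'Atmospheric': 'red', 'Underground': 'green'}
TAG = re.compile(r'<[^>]+>')  # crude tag stripper, stand-in for BeautifulSoup


def parse_event(row):
    """Build one event dict from one table row (hypothetical helper)."""
    event = {name: row[i] for i, name in COMBOS.items()}
    event['depth'] = '0' if row[8] == '&nbsp;' else row[8]

    tokens = row[9].split()
    if len(tokens) >= 2:
        event['yield'] = tokens[-1]
    elif row[9] == '&nbsp;':
        event['yield'] = '10'
    else:
        event['yield'] = row[9]

    # Keep only the link text of the date cell
    event['date'] = TAG.sub('', row[4]).strip()

    for kind, colour in FILLS.items():
        if kind in row[3]:
            event['fill'] = colour
    return event


# Abbreviated TRINITY row: HTML shortened, trailing cells dropped
row = ['958', '<a id="958F">map</a>', 'USA', 'Atmospheric',
       '<a href="#">1945/07/16 11:29:45</a>', '33.6753', '-106.4747',
       '', '-.03', '21', '', '', 'TRINITY']
event = parse_event(row)
print(event)
```

With a function like this, the main loop would reduce to something like
event_list = [parse_event(row) for row in TABLE_CONTENT].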

Also, the content has 2,153 items. What would be the correct way to keep
it in a separate file and import it into this script to work on it?
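
As one possibility (an assumption, not something the script does yet), the
list could be stored as JSON and read back with json.load. The file name
table_content.json and the tiny sample list below are made up for
illustration; a temporary directory is used only so the sketch runs on its
own.

```python
import json
import os
import tempfile

# Stand-in for the real 2,153-row list
sample = [['958', 'USA', 'TRINITY'], ['959', 'USA', 'LITTLEBOY']]

# Write the data once to its own file (hypothetical file name)
path = os.path.join(tempfile.mkdtemp(), 'table_content.json')
with open(path, 'w') as f:
    json.dump(sample, f)

# In the processing script, load it back instead of embedding the list
with open(path) as f:
    table_content = json.load(f)

print(len(table_content))
```

Another option would be to leave the list as Python in its own module, say a
hypothetical table_content.py containing TABLE_CONTENT = [...], and use
from table_content import TABLE_CONTENT in the script.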

Any advice would be much appreciated.

norman


-- 
%>>> "".join( [ {'*':'@','^':'.'}.get(c,None) or chr(97+(ord(c)-83)%26) for
c in ",adym,*)&uzq^zqf" ] )