<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">I am working on scraping Weather Underground using its XML interface. <div><br></div><div>I am hoping to add this to pywapi, but that looks like it has been abandoned; I haven't seen any updates to it in ages.</div><div><br></div><div>I'm using the Weather Underground XML API (<a href="http://wiki.wunderground.com/index.php/API_-_XML">http://wiki.wunderground.com/index.php/API_-_XML</a>), and it's working, except that something odd is happening with the forecast portion.</div><div><br></div><div>When I parse the forecast, the highs and lows come back with the same value.</div><div><br></div><div>I don't see another approach to this, though. Weather Underground presents the same data structure for the forecast, which is why I am breaking it into a list. I'm no XML expert, but I believe I have etree working, though not necessarily in the best way. Is there a better way to read this via etree?</div><div><br></div><div>The only constraint is that the code has to run on Python 2.5.1, due to limitations in the Indigo framework.</div><div><br></div><div>The scan_node function scans an individual node, and works fine for the weather forecast, 
but because of the duplicate XML tags in the forecast XML, I had to break it out into a list manually.</div><div><br></div><div>But this doesn't explain why the highs are not being read properly.</div><div><br></div><div>Anyone?</div><div><div><br></div><div>WUND_WEATHER_URL<span class="Apple-tab-span" style="white-space:pre"> </span>= 'http://api.wunderground.com/auto/wui/geo/WXCurrentObXML/index.xml?query=%s'</div><div>WUND_FORECAST_URL<span class="Apple-tab-span" style="white-space:pre"> </span>= 'http://api.wunderground.com/auto/wui/geo/ForecastXML/index.xml?query=%s'</div><div>WUND_PWS_WEATHER_URL = 'http://api.wunderground.com/weatherstation/WXCurrentObXML.asp?ID=%s'</div><div><br></div><div>def<span class="Apple-tab-span" style="white-space:pre"> </span>scan_node ( data, node, ns_wund_data_structure):</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>for (category, attrs) in ns_wund_data_structure.iteritems():</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>if node.tag in attrs:</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>for attrsname in attrs:</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>if attrsname == node.tag:</div><div><span class="Apple-tab-span" style="white-space:pre">                    </span>if category not in data:</div><div><span class="Apple-tab-span" style="white-space:pre">                        </span>#</div><div><span class="Apple-tab-span" style="white-space:pre">                        </span>#<span class="Apple-tab-span" style="white-space:pre"> </span>key not in dictionary, create subdictionary</div><div><span class="Apple-tab-span" style="white-space:pre">                        </span>#</div><div><span class="Apple-tab-span" style="white-space:pre">                        </span>data [category] = {}</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre">                    </span>if node.text is not None:</div><div><span class="Apple-tab-span" style="white-space:pre">                        </span>data [category] [node.tag.strip()] = 
node.text.strip()</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>return data</div><div><br></div><div>def get_weather_from_wund(location_id, hl = ''):</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>url = WUND_WEATHER_URL % (location_id)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>handler = urllib2.urlopen(url)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>tree = parse ( handler)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>handler.close()</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>weather_data = {}</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>elem = tree.getroot ()</div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>ns_wund_data_structure = { </div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'display_location': ('full', 'city', 'state', 'state_name', 'country', 'zip', 'latitude', 'longitude', 'elevation'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'current_observation': ('station_id', 'observation_time', 'observation_time_rfc822', 'local_time', 'local_time_rfc822',</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>'local_epoch', 'weather', 'temperature_string', 'temp_f', 'temp_c', 'relative_humidity',</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>'wind_string', 'wind_dir', 'wind_degrees', 'wind_mph', 'wind_gust', 'pressure_string',</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>'pressure_mb', 'pressure_in', 'dewpoint_string', 'dewpoint_f', 'dewpoint_c', </div><div><span class="Apple-tab-span" style="white-space:pre">            </span>'heat_index_string', 'heat_index_f', 'heat_index_c', 'windchill_string', 'windchill_f', </div><div><span class="Apple-tab-span" style="white-space:pre">            
</span>'windchill_c', 'visibility_mi', 'visibility_km', 'forecast_url','history_url',</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>'ob_url', 'icon_url_base', 'icon_url_name', 'icon'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'icons'<span class="Apple-tab-span" style="white-space:pre"> </span>: ('icon_set', 'icon_url', 'icon_url_base', 'icon_url_name', 'icon')</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>}<span class="Apple-tab-span" style="white-space:pre"> </span></div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>for category in ns_wund_data_structure:</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>weather_data[category] = {}</div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>for node in elem.getchildren():</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>children = node.getchildren()</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>if children:</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>for subnode in children:</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>weather_data = scan_node( weather_data, subnode, ns_wund_data_structure)</div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><div><span class="Apple-tab-span" style="white-space:pre">        </span>else:</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>weather_data = scan_node ( weather_data, node, ns_wund_data_structure)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>return weather_data</div><div><br></div><div>def<span class="Apple-tab-span" style="white-space:pre"> </span>walk_tree (root_node, data, dstructure):</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>for node in 
root_node.getchildren():</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>children = node.getchildren()</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>if children:</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>for subnode in children:</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>if subnode.getchildren():</div><div><span class="Apple-tab-span" style="white-space:pre">                    </span>walk_tree (subnode, data, dstructure)</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>else:</div><div><span class="Apple-tab-span" style="white-space:pre">                    </span>data = scan_node ( data, subnode, dstructure)</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>else:</div><div><span class="Apple-tab-span" style="white-space:pre">            </span>data = scan_node ( data, node, dstructure)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>return data</div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><div>def get_forecast_from_wund(location_id, weather_data = None, hl = ''):</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>url = WUND_FORECAST_URL % (location_id)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>handler = urllib2.urlopen(url)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>tree = parse ( handler)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>handler.close()</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>if weather_data is None:</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>weather_data = {}</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>elem = tree.getroot ()</div><div><br></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>ns_forecast_structure = { </div><div><span class="Apple-tab-span" style="white-space:pre">        
</span>'txt_forecast'<span class="Apple-tab-span" style="white-space:pre"> </span>: ( 'number', 'forecastday'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'high'<span class="Apple-tab-span" style="white-space:pre"> </span>: ('fahrenheit', 'celsius'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'low'<span class="Apple-tab-span" style="white-space:pre"> </span>: ('fahrenheit', 'celsius'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'simpleforecast': ('forecastday', 'conditions', 'icon', 'skyicon'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'forecastday'<span class="Apple-tab-span" style="white-space:pre"> </span>: ('period', 'title', 'fcttext', 'date', 'high', 'low', 'conditions', 'icon', 'skyicon'),</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>'date'<span class="Apple-tab-span" style="white-space:pre"> </span>: ('epoch', 'pretty_short', 'pretty', 'day', 'month', 'year', 'yday','hour', 'min', </div><div><span class="Apple-tab-span" style="white-space:pre">            </span>'sec', 'isdst', 'monthname', 'weekday_short', 'weekday', 'ampm', 'tz_short', 'tz_long')<span class="Apple-tab-span" style="white-space:pre"> </span>}<span class="Apple-tab-span" style="white-space:pre"> </span></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>weather_data = walk_tree (elem, weather_data, ns_wund_data_structure)</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>weather_data["forecast"] = []</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>forecast_data = {}</div><div><span class="Apple-tab-span" style="white-space:pre">    </span>forecast_root = tree.find ("//simpleforecast")</div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>for subnode in forecast_root.getchildren():</div><div><span class="Apple-tab-span" 
style="white-space:pre">        </span>forecast_data = {}</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>forecast_data = walk_tree (subnode, forecast_data, ns_forecast_structure)</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>weather_data["forecast"].append (forecast_data)</div><div><span class="Apple-tab-span" style="white-space:pre"> </span></div><div><span class="Apple-tab-span" style="white-space:pre">    </span>return weather_data</div></div><div><br></div></body></html>