<br><br><div class="gmail_quote">On Mon, Feb 22, 2010 at 10:46 PM, Stefan Behnel <span dir="ltr">&lt;<a href="mailto:stefan_ml@behnel.de">stefan_ml@behnel.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="im">sharifah ummu kulthum, 22.02.2010 14:24:<br>
</div><div class="im">&gt;   File &quot;grabmy.py&quot;, line 63, in get_html<br>
&gt;     return BeautifulSoup(content)<br>
&gt;   File &quot;build/bdist.linux-i686/egg/BeautifulSoup.py&quot;, line 1499, in __init__<br>
&gt;   File &quot;build/bdist.linux-i686/egg/BeautifulSoup.py&quot;, line 1230, in __init__<br>
&gt;   File &quot;build/bdist.linux-i686/egg/BeautifulSoup.py&quot;, line 1263, in _feed<br>
&gt;   File &quot;/usr/lib/python2.6/HTMLParser.py&quot;, line 108, in feed<br>
&gt;     self.goahead(0)<br>
&gt;   File &quot;/usr/lib/python2.6/HTMLParser.py&quot;, line 148, in goahead<br>
&gt;     k = self.parse_starttag(i)<br>
&gt;   File &quot;/usr/lib/python2.6/HTMLParser.py&quot;, line 226, in parse_starttag<br>
&gt;     endpos = self.check_for_whole_start_tag(i)<br>
&gt;   File &quot;/usr/lib/python2.6/HTMLParser.py&quot;, line 301, in check_for_whole_start_tag<br>
&gt;     self.error(&quot;malformed start tag&quot;)<br>
&gt;   File &quot;/usr/lib/python2.6/HTMLParser.py&quot;, line 115, in error<br>
&gt;     raise HTMLParseError(message, self.getpos())<br>
&gt; HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36<br>
<br>
</div>Just noticed this now - you seem to be using BeautifulSoup, likely version<br>
3.1. This version does not parse broken HTML well, so use<br>
version 3.0.8 instead, or switch to the tools I indicated.<br>
<br>
Note that switching tools means that you need to change your code to use<br>
them. Just installing them is not enough.<br>
<font color="#888888"><br>
Stefan<br>
<br>
</font></blockquote></div><br>I am sorry, but I don&#39;t know how to change the code, as I have only just started learning Python and don&#39;t really understand it. How do I switch the BeautifulSoup version, or change the code to use the other tools?<br>
<br>Here is the code:<br><br><span style="font-family: courier new,monospace;">&#39;&#39;&#39;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">Copyright (c) 2008  Yap Sok Ann &lt;<a href="mailto:sayap@sayap.com">sayap@sayap.com</a>&gt;</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">This module contains xmltv grabbers for Malaysia channels.</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">&#39;&#39;&#39;</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">__author__ = &#39;Yap Sok Ann &lt;<a href="mailto:sayap@sayap.com">sayap@sayap.com</a>&gt;&#39;</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">__license__ = &#39;PSF License&#39;</span><br style="font-family: courier new,monospace;"><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">import logging</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">from datetime import date as dt</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">from datetime import datetime, time, timedelta</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">from dateutil.tz import tzlocal</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">from httplib2 import Http</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">from lxml import etree</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">from urllib import urlencode</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">from BeautifulSoup import BeautifulSoup</span><br style="font-family: courier new,monospace;"><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">channels = [&#39;rtm1&#39;, &#39;rtm2&#39;, &#39;tv3&#39;, &#39;ntv7&#39;, &#39;8tv&#39;, &#39;tv9&#39;]</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">datetime_format = &#39;%Y%m%d%H%M%S %z&#39;</span><br style="font-family: courier new,monospace;"><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">h = Http()</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">h.force_exception_to_status_code = True</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">#h.timeout = 15</span><br style="font-family: courier new,monospace;"><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">logging.basicConfig(</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    level=logging.DEBUG,</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    format=&#39;%(asctime)s %(levelname)-8s %(process)d %(message)s&#39;,</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">log = logging.getLogger(__name__)</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">def strclean(s):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    s = s.strip().replace(&#39;&amp;lsquo;&#39;, &#39;\&#39;&#39;).replace(&#39;&amp;rsquo;&#39;, &#39;\&#39;&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    if s != &#39;&amp;nbsp;&#39;:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        return s</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">class Grabber(object):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    base_url = None</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def __init__(self, channel):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        self.channel = channel</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        self.url = self.base_url</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def qs_params(self, date, **kwargs):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        &#39;&#39;&#39;Returns a dict of params to form the url&#39;s query string</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        &#39;&#39;&#39;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        raise NotImplementedError</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    def _parse_html(self, date, html):</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        &#39;&#39;&#39;Returns a list of dicts with the following keys:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        - mandatory: title, start</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        - optional: stop, sub_title, desc, episode_num, episode_system</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        &#39;&#39;&#39;</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        raise NotImplementedError</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def get_html(self, date, **kwargs):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        params = self.qs_params(date, **kwargs)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        response, content = h.request(self.url + &#39;?&#39; + urlencode(params))</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        if response.status == 200:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            return BeautifulSoup(content)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        else:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            log.error(&#39;Status: %s\nContent: %s&#39; % (response.status, content))</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def parse_html(self, date, html):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        prev_schedule = None</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        try:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            for schedule in self._parse_html(date, html):</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                if &#39;stop&#39; in schedule:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                    yield schedule</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                elif prev_schedule:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                    prev_schedule[&#39;stop&#39;] = schedule[&#39;start&#39;]</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                    yield prev_schedule</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                prev_schedule = schedule</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        except Exception:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            log.exception(&#39;Cannot parse html for date %s&#39; % date)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    def to_xml(self, schedules):</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        for schedule in schedules:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            program = etree.Element(&#39;programme&#39;, channel=self.channel,</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                start=schedule[&#39;start&#39;].strftime(datetime_format),</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                stop=schedule[&#39;stop&#39;].strftime(datetime_format))</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            title = etree.SubElement(program, &#39;title&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            title.text = schedule[&#39;title&#39;]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            if schedule.get(&#39;episode_num&#39;):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                episode_num = etree.SubElement(program, &#39;episode-num&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                episode_num.set(&#39;system&#39;, schedule.get(&#39;episode_system&#39;))</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                episode_num.text = schedule[&#39;episode_num&#39;]</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            for field in [&#39;sub_title&#39;, &#39;desc&#39;]:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                if schedule.get(field):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                    elem = etree.SubElement(program, field.replace(&#39;_&#39;, &#39;-&#39;))</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                    elem.text = schedule[field]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            yield program</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def grab(self, date, **kwargs):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        html = self.get_html(date, **kwargs)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        if html:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            return self.to_xml(self.parse_html(date, html))</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">class Astro(Grabber):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    base_url = &#39;<a href="http://www.astro.com.my/channels/%(channel)s/Default.asp">http://www.astro.com.my/channels/%(channel)s/Default.asp</a>&#39;</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    params_dicts = [dict(batch=1),</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                    dict(batch=2)]</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    ignores = [&#39;No Transmission&#39;, &#39;Transmission Ends&#39;]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def __init__(self, channel):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        self.channel = channel</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        self.url = self.base_url % dict(channel=channel)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def qs_params(self, date, **kwargs):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        kwargs[&#39;sDate&#39;] = date.strftime(&#39;%d-%b-%Y&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        return kwargs</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def _parse_html(self, date, html):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        header_row = html.find(&#39;tr&#39;, bgcolor=&#39;#29487F&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        for tr in header_row.fetchNextSiblings(&#39;tr&#39;):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            tds = tr.findChildren(&#39;td&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            title = strclean(tds[1].find(&#39;a&#39;).string)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            if title in self.ignores:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                continue</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            # start time, &#39;21:00&#39; -&gt; 9 PM</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            hour, minute = [int(x) for x in tds[0].string.split(&#39;:&#39;)]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            start = datetime.combine(date,</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                                     time(hour, minute, tzinfo=tzlocal()))</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            # duration, &#39;00:30&#39; -&gt; 30 minutes</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            hours, minutes = [int(x) for x in tds[2].string.split(&#39;:&#39;)]</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            stop = start + timedelta(hours=hours, minutes=minutes)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            yield dict(title=title, start=start, stop=stop)</span><br style="font-family: courier new,monospace;"><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">class TheStar(Grabber):</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    base_url = &#39;<a href="http://star-ecentral.com/tvnradio/tvguide/guide.asp">http://star-ecentral.com/tvnradio/tvguide/guide.asp</a>&#39;</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    params_dicts = [dict(db=&#39;live&#39;)]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    def qs_params(self, date, **kwargs):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        kwargs[&#39;pdate&#39;] = date.strftime(&#39;%m/%d/%Y&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        kwargs[&#39;chn&#39;] = self.channel.replace(&#39;rtm&#39;, &#39;tv&#39;)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        return kwargs</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    def _parse_html(self, date, html):</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        last_ampm = None</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        header_row = html.find(&#39;tr&#39;, bgcolor=&#39;#5e789c&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        for tr in header_row.fetchNextSiblings(&#39;tr&#39;):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            tds = tr.findChildren(&#39;td&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            schedule = {}</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            schedule[&#39;title&#39;] = strclean(tds[1].find(&#39;b&#39;).find(&#39;font&#39;).string)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            schedule[&#39;desc&#39;] = strclean(tds[2].find(&#39;font&#39;).string)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            episode_num = strclean(tds[3].find(&#39;font&#39;).string)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            if episode_num:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                try:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                    episode_num = int(episode_num) - 1</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                    episode_num = &#39;.&#39; + str(episode_num) + &#39;.&#39;</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                    episode_system = &#39;xmltv_ns&#39;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                except ValueError:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                    episode_system = &#39;onscreen&#39;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                schedule[&#39;episode_num&#39;] = episode_num</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                schedule[&#39;episode_system&#39;] = episode_system</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            # start time, &#39;9.00pm&#39; -&gt; 9 PM</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            time_str = tds[0].find(&#39;font&#39;).string</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            ampm = time_str[-2:]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            hour, minute = [int(x) for x in time_str[:-2].split(&#39;.&#39;)]</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            if ampm == &#39;pm&#39; and hour &lt; 12:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                hour += 12</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            elif ampm == &#39;am&#39; and hour == 12:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                hour = 0</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            if last_ampm == &#39;pm&#39; and ampm == &#39;am&#39;:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                date = date + timedelta(1)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            schedule[&#39;start&#39;] = datetime.combine(</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                date, time(hour, minute, tzinfo=tzlocal()))</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            last_ampm = ampm</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">            </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            yield schedule</span><br style="font-family: courier new,monospace;">
<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">def main():</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    from optparse import OptionParser</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    parser = OptionParser()</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    parser.add_option(&#39;-s&#39;, &#39;--source&#39;, dest=&#39;source&#39;,</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        help=&#39;SOURCE to grab from: Astro, TheStar. Default: TheStar&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    parser.add_option(&#39;-d&#39;, &#39;--date&#39;, dest=&#39;date&#39;,</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        help=&#39;Start DATE to grab schedules for (YYYY-MM-DD). Default: today&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    parser.add_option(&#39;-n&#39;, &#39;--days&#39;, dest=&#39;days&#39;,</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        help=&#39;Number of DAYS to grab schedules for. Default: 1&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    parser.add_option(&#39;-f&#39;, &#39;--file&#39;, dest=&#39;filename&#39;, metavar=&#39;FILE&#39;,</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        help=&#39;Output FILE to write to. Default: stdout&#39;)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    options, args = parser.parse_args()</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    if options.source is None:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        cls = TheStar</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    else:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        cls = globals()[options.source]</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    if options.date is None:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        date = dt.today()</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    else:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        date = dt(*[int(x) for x in options.date.split(&#39;-&#39;)])</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    if options.days is None:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        days = 1</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    else:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        days = int(options.days)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    root = etree.Element(&#39;tv&#39;)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    for channel in channels:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        grabber = cls(channel)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        for i in range(days):</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">            for params_dict in cls.params_dicts:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                for elem in grabber.grab(date + timedelta(i), **params_dict) or []:</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">                    root.append(elem)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    </span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    xml = etree.tostring(root, encoding=&#39;UTF-8&#39;, xml_declaration=True,</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">                         pretty_print=True)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    if options.filename is None:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        print xml</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    else:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">        open(options.filename, &#39;w&#39;).write(xml)</span><br style="font-family: courier new,monospace;"><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">if __name__ == &#39;__main__&#39;:</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">    main()</span><br style="font-family: courier new,monospace;"><br>
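<br>On the &quot;switch the version&quot; part of the question, here is a minimal sketch of the check implied by the quoted reply: the 3.1 series is the one said to handle broken HTML badly, and 3.0.8 is the suggested replacement. The helper names parse_version and is_affected are hypothetical and are not part of BeautifulSoup. The sketch assumes the installed version is available as a plain dotted string of numbers (with BeautifulSoup 3 this is normally the module&#39;s __version__ attribute). If the package was installed as an egg through setuptools, as the traceback paths suggest, a command along the lines of easy_install &quot;BeautifulSoup==3.0.8&quot; may be enough to go back to the older release, but that is an assumption about the local setup.<br>
<pre>
```python
def parse_version(version):
    """Turn a dotted version string such as '3.1.0.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


def is_affected(version):
    """Return True for the 3.1 series named in the quoted reply."""
    return parse_version(version)[:2] == (3, 1)


# Hypothetical usage. With BeautifulSoup 3 installed, the string would
# normally come from: import BeautifulSoup; BeautifulSoup.__version__
for candidate in ('3.0.8', '3.1.0.1'):
    print('%s affected: %s' % (candidate, is_affected(candidate)))
```
</pre>
If the check reports an affected version, the script above should not need any change once 3.0.8 is installed, since it only relies on the BeautifulSoup 3 import and API it already uses.<br>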