[Baypiggies] custom date parser

Aaron Maxwell amax at redsymbol.net
Wed Sep 3 21:53:18 CEST 2008


On Wednesday 03 September 2008 11:22:15 am you wrote:
> Assuming you want exactly the functionality you've coded 
That is correct. 

> (as opposed 
> to a more general fuzzy parse as Anna suggests), the key to making it 
> more compact is seeing that what you're doing in your code is an
> unrolled loop -- and rolling it up again.  E.g., after the initial
> assignment to parts, the rest of the code could be:
>
> for i in range(1, 4):
>   try: return datetime.date(*parts)
>   except ValueError: parts[-i] = 1
> return None

That is the insight I was looking for.  Thanks Alex.
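
For the archive, here is a minimal sketch of how the rolled-up loop might slot into the full function. It assumes datetime.date (to match the original function's return type) and adds a guard for non-numeric or wrongly shaped input; that guard is my own assumption about what the elided "validation/error checking code" would do, not something from the thread.

```python
import datetime

def parse_datefield(raw_pubdate):
    """Parse 'YYYY-MM-DD' into a datetime.date.

    An out-of-range day is reset to 1; if the month is also out of
    range, it is reset to 1 as well. Returns None if nothing works.
    """
    try:
        parts = [int(p) for p in raw_pubdate.split('-')]
    except ValueError:
        return None  # a field was not an integer (assumed handling)
    if len(parts) != 3:
        return None  # not in YYYY-MM-DD shape (assumed handling)

    # Attempt 1: as given. Attempt 2: day=1. Attempt 3: month=1 too.
    for i in range(1, 4):
        try:
            return datetime.date(*parts)
        except ValueError:
            parts[-i] = 1
    return None
```

Note that on the third failure the loop also overwrites the year, which is harmless because the function returns None immediately afterwards.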

Anna, thanks for suggesting dateutil - not what's needed for this particular 
problem, but I was not familiar with it.

Aaron

>
> On Wed, Sep 3, 2008 at 10:12 AM, Aaron Maxwell <amax at redsymbol.net> wrote:
> > Hi all,
> >
> > Below is a function that parses a date string in the form "YYYY-MM-DD"
> > and returns a datetime.date object, or None if it's bad input and
> > cannot be converted.  However, it does a couple of special tricks.
> > The data in certain cases is known to have a value for the day or
> > month that is not in the valid range; e.g., it may be 2007-11-31
> > (November traditionally only has 30 days), or 2002-14-23.  In this
> > situation, I want to keep the most significant good field(s) and set
> > the lesser ones to 1, then return the date object from that - so the
> > results of the above would be date(2007, 11, 1) or date(2002, 1, 1)
> > respectively.
> >
> > The function below does this.  It uses a triply-nested try/except
> > block, and I can't shake the feeling that there is a shorter and
> > clearer implementation.  Any thoughts?
> >
> > Of course, one approach would be to manually check the month and
> > day fields before passing them to datetime.date.  I would rather reuse
> > the validation code in the date class, though, for obvious reasons.
> >
> > Thanks in advance,
> > Aaron
> >
> > {{{
> > import datetime
> > def parse_datefield(raw_pubdate):
> >    '''
> >    Parse a datefield
> >    Takes in a date string in the format YYYY-MM-DD.
> >    Returns a datetime.date object.
> >    '''
> >    # ... imagine validation/error checking code here ...
> >    parts = list(map(int, raw_pubdate.split('-')))
> >    try:
> >        d = datetime.date(*parts)
> >    except ValueError:
> >        # day out of range?
> >        parts[-1] = 1
> >        try:
> >            d = datetime.date(*parts)
> >        except ValueError:
> >            # month out of range?
> >            parts[-2] = 1
> >            try:
> >                d = datetime.date(*parts)
> >            except ValueError:
> >                # give up
> >                d = None
> >    return d
> > }}}
> >
> > --
> > Aaron Maxwell
> > http://redsymbol.net
> > _______________________________________________
> > Baypiggies mailing list
> > Baypiggies at python.org
> > To change your subscription options or unsubscribe:
> > http://mail.python.org/mailman/listinfo/baypiggies
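
For comparison, the manual-check alternative mentioned in the original message might look roughly like the sketch below. The function name parse_datefield_manual is hypothetical, and the handling of non-numeric input and out-of-range years is my assumption. It uses calendar.monthrange from the standard library to get the number of days in a month, which is exactly the kind of range logic that datetime.date already performs internally.

```python
import calendar
import datetime

def parse_datefield_manual(raw_pubdate):
    """Hypothetical variant that range-checks month and day by hand."""
    try:
        year, month, day = (int(p) for p in raw_pubdate.split('-'))
    except ValueError:
        return None  # non-numeric field or wrong number of fields
    if not (datetime.MINYEAR <= year <= datetime.MAXYEAR):
        return None  # no usable year, so give up
    if not (1 <= month <= 12):
        # Bad month: the day cannot be trusted either, reset both.
        month, day = 1, 1
    elif not (1 <= day <= calendar.monthrange(year, month)[1]):
        # Good month, bad day: keep the month, reset the day.
        day = 1
    return datetime.date(year, month, day)
```

This works, but every rule (valid year range, month range, days per month, leap years) has to be restated or delegated explicitly, which is the duplication the try/except approach avoids.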



-- 
Aaron Maxwell
http://redsymbol.net

