On Wed, May 31, 2017 at 12:27 PM, Mathieu Dupuy <deronnax(a)gmail.com> wrote:
> * iterating over the string, stopping when something is wrong
> (might process almost all of the string and finally give up because
> the last part is wrong, e.g. incorrect microseconds or time zone).
> Penalizes invalid strings; best case when most of the strings to
> process are valid.
> * first checking that the string is correct, then iterating over it and
> handling each part. Early detection of incorrect strings, useless
> overhead for valid strings. Penalizes valid strings; best case when
> most of the strings to process are invalid.
>
> I have a preference for solution #1.
I agree -- I suspect that it won't take much longer to convert the string
than it would to validate it anyway, so (2) would add a lot of overhead.
Also -- I think it's fair to optimize for most strings being valid: if you
are parsing a lot of datetimes (the only time you care about performance),
most of them had better be valid, or performance is your least concern.
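
To make the two options concrete, here is a rough sketch of both. It
assumes a single fixed layout (YYYY-MM-DDTHH:MM:SS) purely for
illustration; the function names and the layout table are made up for
this example and are not a proposal for the real implementation.

```python
import re
from datetime import datetime

# Assumed fixed layout for this sketch only: YYYY-MM-DDTHH:MM:SS.
# Each entry is (field width, separator expected after the field).
_LAYOUT = [(4, "-"), (2, "-"), (2, "T"), (2, ":"), (2, ":"), (2, "")]
_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", re.ASCII)


def parse_as_you_go(s):
    """Option #1: convert fields left to right, stop at the first bad one."""
    values = []
    pos = 0
    for width, sep in _LAYOUT:
        chunk = s[pos:pos + width]
        if len(chunk) != width or not (chunk.isascii() and chunk.isdigit()):
            raise ValueError(f"bad field at offset {pos}: {s!r}")
        values.append(int(chunk))
        pos += width
        if sep:
            if s[pos:pos + 1] != sep:
                raise ValueError(f"expected {sep!r} at offset {pos}: {s!r}")
            pos += 1
    if pos != len(s):
        raise ValueError(f"trailing data at offset {pos}: {s!r}")
    # datetime() itself rejects out-of-range values such as month 13.
    return datetime(*values)


def validate_then_parse(s):
    """Option #2: check the whole string first, then convert it."""
    if _PATTERN.fullmatch(s) is None:
        raise ValueError(f"not in the expected format: {s!r}")
    # A valid string is scanned twice: once above, once here.
    return parse_as_you_go(s)
```

For a valid string, option #2 does everything option #1 does plus the
up-front check, which is the overhead I'm worried about.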
Alexander Belopolsky wrote:
>
> As I mentioned at the bug tracker, I would prefer to start with the C
> implementation falling back to Python. This is what we do for strptime and
> I don't see why fromisoformat should be different. Let's focus on
> finalizing the desired behavior and getting the Python implementation
> checked in. We don't want to maintain two implementations while the
> features are still subject to revision. Once Python code is mature enough,
> we can implement the C acceleration.
It seems the ISO string parsing is a single function, yes? Couldn't the work
be done in parallel? If Mathieu wants to write a C version, it could be
dropped into datetime at any point.
Ideally, there would be a comprehensive test suite, and then there's little
impact.
IIUC, an ISO 8601 string has three parts:

    date
    time
    tz-offset

So a function that returned:

    date, time, offset = parse_iso(a_string)

could be plugged right into the rest of the implementation.
(I'm suggesting that deciding exactly what to do with the various options
for offset, etc. be kept out of this particular function.)
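
Roughly what I have in mind, as a sketch only. It assumes the extended
form with a 'T' separator, optional fractional seconds, and an offset of
'Z' or +HH:MM / -HH:MM. It is lenient in places (int() accepts things
ISO 8601 does not), and returning the offset as a timedelta or None is
my assumption, not a settled design.

```python
from datetime import date, time, timedelta


def parse_iso(s):
    """Split 'YYYY-MM-DDTHH:MM:SS[.f][Z|+HH:MM|-HH:MM]' into three parts.

    Returns (date, time, offset), where offset is a timedelta or None.
    What to do with the offset is left to the caller.
    """
    date_part, sep, rest = s.partition("T")
    if not sep:
        raise ValueError(f"missing 'T' separator: {s!r}")
    year, month, day = (int(p) for p in date_part.split("-"))

    # Peel the offset, if any, off the end of the time portion.
    offset = None
    if rest.endswith("Z"):
        offset = timedelta(0)
        rest = rest[:-1]
    else:
        for sign_char, sign in (("+", 1), ("-", -1)):
            if sign_char in rest:
                rest, _, off = rest.partition(sign_char)
                off_hours, off_minutes = off.split(":")
                offset = sign * timedelta(hours=int(off_hours),
                                          minutes=int(off_minutes))
                break

    # Split off fractional seconds and pad or truncate to microseconds.
    hms, _, frac = rest.partition(".")
    hour, minute, second = (int(p) for p in hms.split(":"))
    microsecond = int(frac.ljust(6, "0")[:6]) if frac else 0

    return (date(year, month, day),
            time(hour, minute, second, microsecond),
            offset)
```

The caller can then decide whether a missing offset means a naive
datetime, whether to build a timezone from the timedelta, and so on,
without this function having to know.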
-CHB
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker(a)noaa.gov