On Wed, May 31, 2017 at 12:27 PM, Mathieu Dupuy <deronnax@gmail.com> wrote:
 
* iterating over the string, stopping when something is wrong
(might process almost all of the string and finally give up because the
last part is wrong, e.g. incorrect microseconds or time zone). Penalizes
invalid strings; best case when most of the strings to process are
valid.
* first checking that the string is correct, then iterating over it and
handling each part (early detection of incorrect strings, useless
overhead for valid strings). Penalizes valid strings; best case when most
of the strings to process are invalid.

I have a preference for solution #1.

I agree -- I suspect that it won't take much longer to convert the string than it would to validate it anyway, so (2) would add a lot of overhead.

Also -- I think it's fair to optimize for most strings being valid. If you are parsing a lot of datetimes (the only time you care about performance), most of them had better be valid, or performance is your least concern.
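
To make the trade-off concrete, here is a minimal sketch of the two strategies, restricted to plain YYYY-MM-DD dates for brevity. The names (parse_as_you_go, validate_then_parse, and the helpers) are made up for illustration and are not part of datetime; neither function checks that the month or day is in range.

```python
import re


def _digits(s, start, end):
    """Convert s[start:end] to int, or raise if it is not all ASCII digits."""
    part = s[start:end]
    if len(part) != end - start or not (part.isascii() and part.isdigit()):
        raise ValueError(f"bad field {part!r} in {s!r}")
    return int(part)


def _sep(s, pos, char):
    """Raise unless the character at pos is the expected separator."""
    if s[pos:pos + 1] != char:
        raise ValueError(f"expected {char!r} at position {pos} in {s!r}")


def parse_as_you_go(s):
    """Strategy 1: convert each field in turn, give up at the first bad one."""
    year = _digits(s, 0, 4)
    _sep(s, 4, "-")
    month = _digits(s, 5, 7)
    _sep(s, 7, "-")
    day = _digits(s, 8, 10)
    if len(s) != 10:
        raise ValueError(f"trailing characters in {s!r}")
    return year, month, day


_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def validate_then_parse(s):
    """Strategy 2: check the whole string first, then walk over it again."""
    if _DATE_RE.fullmatch(s) is None:
        raise ValueError(f"invalid date string: {s!r}")
    return int(s[0:4]), int(s[5:7]), int(s[8:10])
```

Both return the same tuple for a valid string and raise ValueError for an invalid one; the difference is only in how much of the string gets touched twice.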

Alexander Belopolsky wrote: 
As I mentioned at the bug tracker, I would prefer to start with the C implementation falling back to Python. This is what we do for strptime and I don't see why fromisoformat should be different. Let's focus on finalizing the desired behavior and getting the Python implementation checked in. We don't want to maintain two implementations while the features are still subject to revision. Once the Python code is mature enough, we can implement the C acceleration.

It seems the isostring parsing is a single function, yes? Couldn't the work be done in parallel? If Mathieu wants to write a C version, it could be dropped into datetime at any point.

Ideally, there would be a comprehensive test suite, so swapping in a new implementation would have little impact.


IIUC, an ISO 8601 string has three parts:

date
time
tz-offset

so a function that returned:

date, time, offset = parse_iso(a_string)

could be plugged right into the rest of the implementation.

(I'm suggesting that deciding exactly what to do with the various options for offset, etc., be kept out of this particular function.)
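
Something like the following rough sketch is what I have in mind. It is an assumption-laden illustration, not a proposed implementation: it accepts only the 'YYYY-MM-DDThh:mm:ss[.ffffff][+hh:mm|-hh:mm]' layout, it hands the offset back as a plain timedelta (or None), and it leaves any decision about what to do with that offset to the caller.

```python
from datetime import date, time, timedelta


def parse_iso(a_string):
    """Return (date, time, offset) parsed from a_string.

    Only 'YYYY-MM-DDThh:mm:ss[.ffffff][+hh:mm|-hh:mm]' is handled here.
    offset is a timedelta, or None when the string carries no offset.
    """
    def fail():
        raise ValueError(f"invalid ISO string: {a_string!r}")

    def num(start, end):
        # Convert a fixed-width run of ASCII digits to int.
        part = a_string[start:end]
        if len(part) != end - start or not (part.isascii() and part.isdigit()):
            fail()
        return int(part)

    def expect(pos, chars):
        # Require one of the given characters at pos and return it.
        ch = a_string[pos:pos + 1]
        if not ch or ch not in chars:
            fail()
        return ch

    # Date part; date() itself rejects out-of-range values.
    year = num(0, 4)
    expect(4, "-")
    month = num(5, 7)
    expect(7, "-")
    d = date(year, month, num(8, 10))

    # Time part, with optional six-digit microseconds.
    expect(10, "T")
    hour = num(11, 13)
    expect(13, ":")
    minute = num(14, 16)
    expect(16, ":")
    second = num(17, 19)
    pos, micro = 19, 0
    if a_string[pos:pos + 1] == ".":
        micro = num(20, 26)
        pos = 26
    t = time(hour, minute, second, micro)

    # Optional offset, returned uninterpreted as a timedelta.
    offset = None
    if pos < len(a_string):
        sign = -1 if expect(pos, "+-") == "-" else 1
        hours = num(pos + 1, pos + 3)
        expect(pos + 3, ":")
        minutes = num(pos + 4, pos + 6)
        if len(a_string) != pos + 6:
            fail()
        offset = sign * timedelta(hours=hours, minutes=minutes)
    return d, t, offset
```

This follows solution #1 (convert as you go, raise ValueError at the first bad part), and a C version with the same return value could replace it without touching the rest of the code.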

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov