Re: [Datetime-SIG] strategy for the C part of ISO 8601 datetime parsing
On Wed, May 31, 2017 at 12:27 PM, Mathieu Dupuy <deronnax@gmail.com> wrote:
* iterating over the string, stopping when something is wrong (might process almost all of the string and finally give up because the last part is wrong, e.g. incorrect microseconds or time zone). Penalizes invalid strings; best case when most of the strings to process are valid.
* first checking that the string is correct, then iterating over it and handling each part. Early detection of incorrect strings, useless overhead for valid strings. Penalizes valid strings; best case when most of the strings to process are invalid.
I have a preference for solution #1.
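For concreteness, the two strategies quoted above might look roughly like this in Python. This is only a sketch under stated assumptions: the format is deliberately narrowed to `YYYY-MM-DDTHH:MM:SS`, and the names `parse_single_pass`, `parse_validate_first`, `_digits` and `_expect` are made up for illustration; none of them come from the datetime module or from any proposed patch.

```python
import re
from datetime import datetime

# Assumption: a deliberately narrow format, YYYY-MM-DDTHH:MM:SS, with no
# fractional seconds or offset. Real ISO 8601 parsing has many more cases.
_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}")


def _digits(s, start, length):
    """Convert s[start:start+length] to int; raise ValueError if not all ASCII digits."""
    chunk = s[start:start + length]
    if len(chunk) != length or not (chunk.isascii() and chunk.isdigit()):
        raise ValueError(f"expected {length} digits at position {start}")
    return int(chunk)


def _expect(s, pos, char):
    """Raise ValueError unless s[pos] is the given separator character."""
    if pos >= len(s) or s[pos] != char:
        raise ValueError(f"expected {char!r} at position {pos}")


def parse_single_pass(s):
    """Strategy 1: convert each field as it is reached, failing at the first bad one."""
    year = _digits(s, 0, 4)
    _expect(s, 4, "-")
    month = _digits(s, 5, 2)
    _expect(s, 7, "-")
    day = _digits(s, 8, 2)
    _expect(s, 10, "T")
    hour = _digits(s, 11, 2)
    _expect(s, 13, ":")
    minute = _digits(s, 14, 2)
    _expect(s, 16, ":")
    second = _digits(s, 17, 2)
    if len(s) != 19:
        raise ValueError("unexpected trailing characters")
    return datetime(year, month, day, hour, minute, second)


def parse_validate_first(s):
    """Strategy 2: check the whole shape up front, then convert without further checks."""
    if _SHAPE.fullmatch(s) is None:
        raise ValueError("not in YYYY-MM-DDTHH:MM:SS form")
    return datetime(int(s[0:4]), int(s[5:7]), int(s[8:10]),
                    int(s[11:13]), int(s[14:16]), int(s[17:19]))
```

With a string such as `"2017-05-31T12:27:0x"`, the first function does most of the conversion work before it fails on the last field, while the second rejects the string before converting anything. For a valid string the second function scans the input twice.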
I agree -- I suspect that it won't take much longer to convert the string than it would to validate it anyway, so (2) would add a lot of overhead. Also, I think it's fair to optimize for most strings being valid: if you are parsing a lot of datetimes (the only time you care about performance), most of them had better be valid, or performance is your least concern.

Alexander Belopolsky wrote:
As I mentioned at the bug tracker, I would prefer to start with the C implementation falling back to Python. This is what we do for strptime and I don't see why fromisoformat should be different. Let's focus on finalizing the desired behavior and getting the Python implementation checked in. We don't want to maintain two implementations while the features are still subject to revision. Once the Python code is mature enough, we can implement the C acceleration.
It seems the ISO string parsing is a single function, yes? Couldn't the work be done in parallel? If Mathieu wants to write a C version, it could be dropped into datetime at any point. Ideally, there would be a comprehensive test suite, and then there's little impact.

IIUC, an ISO 8601 string has three parts:

    date time tz-offset

so a function that returned:

    date, time, offset = parse_iso(a_string)

could be plugged right into the rest of the implementation. (I'm suggesting that deciding exactly what to do with the various options for offset, etc. be kept out of this particular function.)

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
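A rough sketch of the `parse_iso` split suggested in the message above, written in Python for readability. Only the name `parse_iso` and the three-part return value come from the message. The accepted format (`YYYY-MM-DDTHH:MM:SS` with an optional fraction and an optional `Z` or `+HH:MM`/`-HH:MM` suffix) and the choice to return the offset as a `timedelta` or `None` are assumptions made for illustration, not decided behavior.

```python
from datetime import date, time, timedelta


def parse_iso(a_string):
    """Split an ISO 8601-style string into (date, time, offset).

    Assumed input form: YYYY-MM-DDTHH:MM:SS[.ffffff][Z|+HH:MM|-HH:MM].
    offset is a timedelta, or None when the string carries no offset.
    What to do with the offset is left to the caller.
    """
    date_part, sep, rest = a_string.partition("T")
    if not sep:
        raise ValueError("missing 'T' separator")
    # Unpacking raises ValueError if there are not exactly three fields.
    year, month, day = date_part.split("-")

    offset = None
    if rest.endswith("Z"):
        rest, offset = rest[:-1], timedelta(0)
    else:
        # After the 'T', a '+' or '-' can only introduce the offset.
        for sign_char, sign in (("+", 1), ("-", -1)):
            time_part, found, off = rest.partition(sign_char)
            if found:
                off_hours, off_minutes = off.split(":")
                offset = sign * timedelta(hours=int(off_hours),
                                          minutes=int(off_minutes))
                rest = time_part
                break

    hour, minute, second = rest.split(":")
    whole, _, frac = second.partition(".")
    # Pad or truncate the fraction to microsecond precision.
    micro = int(frac.ljust(6, "0")[:6]) if frac else 0

    # Note: int() is lenient (it accepts surrounding whitespace and signs);
    # a real implementation would want stricter digit checks.
    return (date(int(year), int(month), int(day)),
            time(int(hour), int(minute), int(whole), micro),
            offset)
```

A caller that wants an aware datetime could then combine the pieces itself, for example by building a `datetime.timezone` from the returned offset, which keeps the offset policy out of the parsing function.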