Hello,
it occurred to me today that it's not currently possible to parse ISO 8601 dates using the datetime.datetime.strptime function (I was parsing datetimes generated by Postgres). The problem is in the semicolon in the time zone offset.
'1997-07-16T19:20:30.45+01:00' is a valid ISO 8601 date and time representation, the likes of which can be generated by the datetime module itself (the isoformat method). The %z strftime directive only recognizes offsets without the semicolon. The Python docs also direct users to inspect the platform strftime documentation; on my system the man page clearly states %z is "... The +hhmm or -hhmm numeric timezone", again, no semicolon support.
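For illustration, a small sketch of the asymmetry described above, assuming Python 3.2 or later (needed for datetime.timezone). The variable name is made up for the example:

```python
from datetime import datetime, timedelta, timezone

# A fixed +01:00 offset, matching the example timestamp in this message.
dt = datetime(1997, 7, 16, 19, 20, 30, 450000,
              tzinfo=timezone(timedelta(hours=1)))

# isoformat() writes the offset with a separator between hours and minutes...
print(dt.isoformat())     # 1997-07-16T19:20:30.450000+01:00

# ...while the %z directive writes it without one.
print(dt.strftime("%z"))  # +0100
```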
Googling around, the most common suggestion is to use a third-party library (such as dateutil), so either this functionality really doesn't exist in a simple form in the datetime module, or it is really undiscoverable. It seems to me ISO 8601 is significant enough (since there are even special methods for it in the datetime module) for it to be parsable, in a non-complex way, by the standard library.
I guess it's interesting to point out a new flag was added to Java's SimpleDateFormat's date and time patterns in Java 7 - the X directive stands for "ISO 8601 time zone" (-08; -0800; -08:00, Z). I've worked with this so I happen to know it off the top of my head.
Thanks in advance for comments.
Tin
Can you show or point us to some examples of the types of dates that cannot be parsed? Do you happen to have the specification for those? "ISO 8601" alone doesn't tell me much, there are many different formats endorsed by the standard, and the standard is hard to read. I can't find it in Wikipedia either, nor in the Postgres docs.
Is it something that a simple regular expression could extract into pieces that can be parsed with Python's strptime?
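A regular expression along these lines is one possible workaround. This is only a minimal sketch, assuming Python 3.2 or later (where strptime understands %z) and input that always carries fractional seconds and a numeric offset. parse_iso_with_offset is a hypothetical helper, not part of the standard library:

```python
import re
from datetime import datetime

# Matches a trailing numeric offset written as +hh:mm or -hh:mm.
_OFFSET_WITH_SEPARATOR = re.compile(r"([+-]\d{2}):(\d{2})$")

def parse_iso_with_offset(text):
    """Drop the separator from the offset, then hand the rest to strptime."""
    normalized = _OFFSET_WITH_SEPARATOR.sub(r"\1\2", text)
    return datetime.strptime(normalized, "%Y-%m-%dT%H:%M:%S.%f%z")

dt = parse_iso_with_offset("1997-07-16T19:20:30.45+01:00")
print(dt.isoformat())  # 1997-07-16T19:20:30.450000+01:00
```

The sketch does not handle a trailing 'Z', a missing offset, or timestamps without fractional seconds; those would need extra patterns or format strings.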
Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
--Guido van Rossum (python.org/~guido)
On Tue, Mar 11, 2014 at 2:57 PM, Guido van Rossum guido@python.org wrote:
Can you show or point us to some examples of the types of dates that cannot be parsed? Do you happen to have the specification for those?
This is a known limitation. We don't have an strftime code for TZ in hh:mm format. GNU date uses ':z' for this.
Ah, colon, not semicolon.
Ugh, I need to get into the habit of checking the issue tracker first. Sorry for not doing that this time.
The resolution of that issue is what I'm after, yeah. Also, yes, I meant colon instead of semicolon.
Thanks!
On 11.03.2014 20:10, Guido van Rossum wrote:
Ah, colon, not semicolon.
On Mar 11, 2014 12:08 PM, "Alexander Belopolsky" <alexander.belopolsky@gmail.com> wrote:
This is a known limitation. We don't have an strftime code for TZ in hh:mm format. GNU date uses ':z' for this.
See http://bugs.python.org/issue5207
On 11/03/2014 19:39, Tin Tvrtković wrote:
Ugh, I need to get into the habit of checking the issue tracker first. Sorry for not doing that this time.
The resolution of that issue is what I'm after, yeah. Also, yes, I meant colon instead of semicolon.
Thanks!
In case you haven't seen it there's also http://bugs.python.org/issue15873 - whether both need to remain open I've no idea.
-- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.
Mark Lawrence
On Tue, Mar 11, 2014, at 14:46, Tin Tvrtković wrote:
'1997-07-16T19:20:30.45+01:00' is a valid ISO 8601 date and time representation, the likes of which can be generated by the datetime module itself (the isoformat method). The %z strftime directive only recognizes offsets without the semicolon. The Python docs also direct users to inspect the platform strftime documentation; on my system the man page clearly states %z is "... The +hhmm or -hhmm numeric timezone", again, no semicolon support.
Maybe Python should switch to platform-independent time parsing/formatting implementations. These can have a %z that accepts this format.
I will incidentally note that as of SUSv4 (POSIX Issue 7), systems are not required to support %z for strptime at all (nor, therefore, is there any requirement to reject formats including a colon or 'Z').
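As a rough illustration of what a platform-independent offset parser might accept, here is a sketch covering the forms mentioned in this thread ('Z', -08, -0800, -08:00). parse_utc_offset is a hypothetical name, not an existing datetime API, and the accepted grammar is an assumption rather than a careful reading of ISO 8601 or RFC 3339:

```python
import re
from datetime import timedelta, timezone

# 'Z', or a sign and two-digit hours, optionally followed by minutes
# with or without a ':' in front of them.
_OFFSET = re.compile(r"(?:Z|([+-])(\d{2})(?::?(\d{2}))?)")

def parse_utc_offset(text):
    """Turn an offset string into a fixed-offset tzinfo object."""
    match = _OFFSET.fullmatch(text)
    if match is None:
        raise ValueError("unrecognized UTC offset: %r" % (text,))
    sign, hours, minutes = match.groups()
    if sign is None:  # the 'Z' branch matched
        return timezone.utc
    delta = timedelta(hours=int(hours), minutes=int(minutes or 0))
    return timezone(-delta if sign == "-" else delta)

for sample in ("-08", "-0800", "-08:00", "Z"):
    print(sample, parse_utc_offset(sample).utcoffset(None))
```

Because the parsing is done in pure Python, the result does not depend on what the platform's strptime happens to support.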
On Tue, Mar 11, 2014 at 11:46 AM, Tin Tvrtković tinchester@gmail.com wrote:
Googling around, the most common suggestion is to use a third-party library (such as dateutil), so either this functionality really doesn't exist in a simple form in the datetime module, or it is really undiscoverable. It seems to me ISO 8601 is significant enough (since there are even special methods for it in the datetime module) for it to be parsable, in a non-complex way, by the standard library.
There's already an open bug about this (well, technically RFC 3339, but close enough): http://bugs.python.org/issue15873
Cheers, Chris