[Python-Dev] Adding a tzidx cache to datetime

Paul Ganssle paul at ganssle.io
Mon May 13 20:01:39 EDT 2019


> From Marc-Andre Lemburg, I understand that Paul's PR is a good
> compromise and that other datetime implementations which cannot use
> tzidx() cache (because it's limited to an integer in [0; 254]) can
> subclass datetime or use a cache outside datetime.

One idea that we can put out there (though I'm hesitant to suggest it,
because generally Python avoids this sort of language lawyering anyway)
is that the situations under which `tzidx()` will cache a value could be
implementation-dependent, and that we document that CPython caches only
integers in [0; 254].

The reason to mention this is that I suspect PyPy, which has a
pure-Python implementation of datetime, will either forgo the cache
entirely and always fall through to the underlying function call, or
cache /any/ Python object returned, since a pure-Python implementation
does not have the advantage of storing the tzidx cache in an unused
padding byte.

Other than the speed concerns, because of the fallback nature of
datetime.tzidx, whether or not the cache is hit will not be visible to
the end user, so I think it's fair to allow interpreter implementations
to choose when a value is or is not cached according to what works best
for their users.
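
To make the fallback behavior concrete, here is a rough pure-Python
sketch. It is not the code in the PR. `SketchDatetime`, `CountingTZ`,
the `should_cache` hook, the `_tzidx_cache` attribute and the
`tzinfo.tzidx(dt)` signature are all names assumed for illustration;
the only things taken from the discussion above are that
`datetime.tzidx()` falls through to an underlying call and that CPython
would cache only integers in [0; 254].

```python
from datetime import datetime, timedelta, tzinfo

_MISSING = object()  # marks "nothing cached yet" in this sketch


class CountingTZ(tzinfo):
    """Hypothetical tzinfo whose tzidx() records how often it is called."""

    def __init__(self):
        self.calls = 0

    def utcoffset(self, dt):
        return timedelta(0)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"

    def tzidx(self, dt):
        # Assumed signature for the underlying (possibly slow) lookup.
        self.calls += 1
        return 3


def cache_never(value):
    # One possible choice: forgo the cache entirely.
    return False


def cache_small_ints(value):
    # The CPython rule described above: only integers in [0; 254].
    return isinstance(value, int) and 0 <= value <= 254


def cache_anything(value):
    # Another possible choice: cache any object returned.
    return True


class SketchDatetime(datetime):
    """Illustrative stand-in for a datetime with a tzidx cache."""

    should_cache = staticmethod(cache_small_ints)

    def tzidx(self):
        cached = getattr(self, "_tzidx_cache", _MISSING)
        if cached is not _MISSING:
            return cached
        # Cache miss: fall through to the tzinfo.
        value = self.tzinfo.tzidx(self)
        if self.should_cache(value):
            self._tzidx_cache = value
        return value


def run(policy):
    """Call tzidx() three times under the given caching policy."""
    tz = CountingTZ()
    cls = type("PolicyDatetime", (SketchDatetime,),
               {"should_cache": staticmethod(policy)})
    dt = cls(2019, 5, 13, 20, 1, 39, tzinfo=tz)
    results = [dt.tzidx() for _ in range(3)]
    return results, tz.calls
```

Under these assumptions, `run(cache_never)`, `run(cache_small_ints)` and
`run(cache_anything)` all return the same list of values; only the
number of underlying calls differs (3, 1 and 1 respectively).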

On 5/13/19 7:52 PM, Victor Stinner wrote:
> On Fri, May 10, 2019 at 09:22, M.-A. Lemburg <mal at egenix.com> wrote:
>> Given that many datetime objects in practice don't use timezones
>> (e.g. in large data stores you typically use UTC and naive datetime
>> objects), I think that making the object itself larger to accommodate
>> for a cache, which will only be used a smaller percentage of the use
>> cases, isn't warranted. Going from 64 bytes to 72 bytes also sounds
>> like this could have negative effects on cache lines.
>>
>> If you need a per object cache, you can either use weakref
>> objects or maintain a separate dictionary in dateutil or other
>> timezone helpers which indexes objects by id(obj).
>>
>> That said, if you only add a byte field which doesn't make the object
>> larger in practice (you merely use space that alignments would
>> use anyway), this shouldn't be a problem. The use of that field
>> should be documented, though, so that other implementations can
>> use/provide it as well.
> From Marc-Andre Lemburg, I understand that Paul's PR is a good
> compromise and that other datetime implementations which cannot use
> tzidx() cache (because it's limited to an integer in [0; 254]) can
> subclass datetime or use a cache outside datetime.
>
> Note: right now, creating a weakref to a datetime fails.
>
> Victor