[Python-ideas] Use lazy loading with hashtable in python gettext module

Tue Dec 18 17:58:56 EST 2018

On 18/12/2018 at 23:09, Barry Scott wrote:
> 
> 
>> On 18 Dec 2018, at 09:10, Serge Ballesta via Python-ideas 
>> <python-ideas at python.org> wrote:
>>
>> In a project of mine, I have used the gettext module from the Python 
>> Standard Library. I have found that several tools can be used to 
>> generate the Machine Object (mo) file from the source Portable Object 
>> (po) file: pybabel (http://babel.pocoo.org/en/latest/), msgfmt.py from 
>> the Python tools, or the original msgfmt from GNU gettext.
> 
> snip
> 
>> Before going further, I would like to know whether implementing lazy 
>> access through the hash table that way seems to be an interesting 
>> improvement or a dead end.
> 
> I think about it this way.
> 
> The largest project I have worked on was internationalised into 
> 14 languages. The British English text translated to American English 
> (en-US) produced a 350KiB MO file.
> 
> The largest MO file was for Thai (th-TH) at 680KiB.
> 
> Is it worth the complexity of the hash code to save that memory?
> 

The hash code is not that complex. The main problem is that it is 
documented only in the source code.
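
To give an idea of the size, here is a minimal sketch of the scheme as I 
read it in the GNU gettext sources: a PJW-style string hash, and an 
open-addressing table of 1-based string numbers probed by double hashing. 
The names build_table and lookup are mine, the table here is a plain 
Python list rather than bytes read from a real mo file, and the details 
should be checked against the GNU sources before relying on them.

```python
def hashpjw(data: bytes) -> int:
    """PJW-style hash over the msgid bytes, kept to 32 bits."""
    hval = 0
    for byte in data:
        hval = ((hval << 4) + byte) & 0xFFFFFFFF
        g = hval & 0xF0000000
        if g:
            # Fold the top four bits back in, then clear them.
            hval ^= g >> 24
            hval ^= g
    return hval


def build_table(msgids, size):
    """Fill a table of 1-based indices into msgids; 0 marks an empty slot.

    Assumes size is a prime larger than len(msgids), so every probe
    sequence eventually reaches a free slot.
    """
    table = [0] * size
    for number, msgid in enumerate(msgids, start=1):
        hval = hashpjw(msgid)
        idx = hval % size
        incr = 1 + hval % (size - 2)
        while table[idx]:
            idx = (idx + incr) % size
        table[idx] = number
    return table


def lookup(table, msgids, msgid):
    """Return the position of msgid in msgids, or None if it is absent."""
    size = len(table)
    hval = hashpjw(msgid)
    idx = hval % size
    incr = 1 + hval % (size - 2)
    while table[idx]:
        if msgids[table[idx] - 1] == msgid:
            return table[idx] - 1
        idx = (idx + incr) % size
    return None


msgids = [b"", b"file", b"hello", b"world"]
table = build_table(msgids, 7)
print(lookup(table, msgids, b"hello"))    # position of b"hello" in msgids
print(lookup(table, msgids, b"missing"))  # None
```

With lazy access, only the slots touched by the probe sequence and the 
candidate strings they point to would need to be read from the file.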

> Will the hash code improve the load time?
> We never noticed the load time and we reloaded the MO on every web page 
> access.
> 
> As for FDs, it uses 1, and on my Linux system I have 1.6M to play with.
> 
> Barry
> 

What makes me think that it deserves a try is that this is how it is 
implemented in the original GNU gettext, and that a TODO note said it 
should be considered. But the documentation also explains that the hash 
table is optional...
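
Since the table is optional, any lazy implementation would first have to 
check whether the file carries one. A minimal sketch, assuming the header 
layout described in the GNU gettext manual (seven 32-bit words: magic, 
revision, number of strings, two table offsets, then hash table size and 
offset) and assuming that a size of 0 means "no hash table". The function 
name hash_table_info is hypothetical.

```python
import struct

LE_MAGIC = 0x950412DE
BE_MAGIC = 0xDE120495


def hash_table_info(data: bytes):
    """Return (hash_size, hash_offset) read from the header of a mo file."""
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic == LE_MAGIC:
        fmt = "<6I"
    elif magic == BE_MAGIC:
        fmt = ">6I"
    else:
        raise ValueError("not a mo file")
    # revision, nstrings, originals offset, translations offset,
    # hash table size, hash table offset
    fields = struct.unpack_from(fmt, data, 4)
    return fields[4], fields[5]


# A hand-made header with no strings and no hash table (illustrative only).
header = struct.pack("<7I", LE_MAGIC, 0, 0, 28, 28, 0, 28)
hash_size, hash_offset = hash_table_info(header)
print("hash table present" if hash_size else "no hash table, fall back")
```

When the size is 0, the loader would have to fall back to reading the 
whole catalog as the gettext module does today.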

Serge