[Tutor] Clarification questions about how Python uses references.

Cameron Simpson cs at cskk.id.au
Fri Jun 25 20:50:06 EDT 2021


On 25Jun2021 19:20, Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
>	But there is no /table/ being indexed by the MD5 hash! So how do you
>locate the original file if given the MD5 hash? File systems that use
>hashes use the file name, and don't hash the file contents (any edit of the
>contents would invalidate the MD5 hash, and require regenerating the hash
>value). The file name stays the same regardless of the edits to the file
>itself, so its hash also stays the same.

The example of a file indexed by its MD5 hash I had in mind was a real 
world example of a client report request, and the corresponding output 
files.  Receive the request, save it to disc for processing under a 
filename based on the MD5 checksum, save that filename in a work queue.  
The worker picks up the filename and makes the reports, likewise saving 
them for the client to collect.
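
As a rough sketch of that flow in Python (not the original code: the 
function names, the ".request"/".report" suffixes and the in-memory 
queue.Queue are all assumptions made up for illustration), it might look 
something like this:

```python
import hashlib
import queue
import tempfile
from pathlib import Path


def save_request(request_bytes: bytes, work_dir: Path, work_queue: queue.Queue) -> Path:
    """Save a request under a name derived from its MD5 and queue that name.

    Hypothetical helper: the ".request" suffix is an assumption.
    """
    digest = hashlib.md5(request_bytes).hexdigest()
    request_path = work_dir / f"{digest}.request"
    request_path.write_bytes(request_bytes)
    # Only the filename goes on the queue; the worker finds the file by name.
    work_queue.put(request_path.name)
    return request_path


def process_next(work_dir: Path, work_queue: queue.Queue) -> Path:
    """Pick up the next queued filename and write a placeholder report.

    Hypothetical worker: the real report generation is not shown in the post.
    """
    filename = work_queue.get()
    request_bytes = (work_dir / filename).read_bytes()
    # Reuse the MD5-based stem so the results sit beside the request.
    report_path = work_dir / (Path(filename).stem + ".report")
    report_path.write_bytes(b"report for: " + request_bytes)
    return report_path


if __name__ == "__main__":
    work_dir = Path(tempfile.mkdtemp())
    work_queue = queue.Queue()
    print(save_request(b"client report request", work_dir, work_queue))
    print(process_next(work_dir, work_queue))
```

The point is only that the name is computed from the content, so the 
receiver needs no counter or lock to choose a name that is not already 
taken in the shared work area.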

The MD5 here is used as an easy way to pick _unique_ filenames for the 
work request and its results. The request never changes content once 
received.

Not the usage you had in mind, but valid nonetheless. A filename is 
effectively an index into a filesystem. In this case the filename is made 
directly from the MD5 of the file's contents, which gives a free unique 
name in a shared work area.

This is a similar use case to that for GUIDs/UUIDs in other contexts.
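
To make the comparison concrete, here is a small sketch using the 
standard hashlib and uuid modules. The two function names are 
hypothetical, chosen just for this example. Both produce a 32 character 
hex name; the difference is that the MD5 name is derived from the 
content while the UUID name is random:

```python
import hashlib
import uuid


def content_name(data: bytes) -> str:
    """Name derived from the content: the same bytes always give the same name."""
    return hashlib.md5(data).hexdigest()


def random_name() -> str:
    """Random name in the UUID style: each call gives a fresh name."""
    return uuid.uuid4().hex


if __name__ == "__main__":
    request = b"client report request"
    print(content_name(request))  # repeatable for the same request bytes
    print(random_name())          # differs from one call to the next
```

Either approach would serve for picking a free name; the content-based 
one has the side effect that an identical request arriving twice maps to 
the same filename.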

Cheers,
Cameron Simpson <cs at cskk.id.au>

