Cross-posting: I sent this to mailman-users@, but wanted to distribute it over mailman-developers@ as well.
Hello,
It would be good if Hyperkitty integrated better with search engines for public archives. In particular:
• Generate sitemap files containing information about each archive page created by Hyperkitty: when it was last modified and how often the page is expected to change (never). See https://gitlab.com/mailman/hyperkitty/-/issues/467 . This way, when search engines index Hyperkitty archives, they crawl only what has changed since the previous crawl instead of everything; crawling everything repeatedly generates a lot of server load. (A sketch follows this list.)
• Include metadata about when the archived message (article) was published. See https://gitlab.com/mailman/hyperkitty/-/issues/466 . Search results will then (sometimes) show the publication date. (Second sketch below.)
• Tell search engines immediately whenever a new public archive webpage is generated, so that they can in theory index that webpage right away. See https://gitlab.com/mailman/hyperkitty/-/issues/468 . (Third sketch below.)
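
For the sitemap bullet, a rough, untested sketch using Django's built-in sitemap framework could look like the code below. The model and field names (Email, date) are assumptions about Hyperkitty's schema and would need to be adapted; location() falls back to get_absolute_url() by default.

# sitemaps.py -- sketch only; model/field names are assumptions
from django.contrib.sitemaps import Sitemap
from hyperkitty.models import Email  # assumed model name

class ArchivedMessageSitemap(Sitemap):
    changefreq = "never"  # archived messages do not change once published
    limit = 50000         # upper bound per sitemap file in the protocol

    def items(self):
        return Email.objects.order_by("-date")

    def lastmod(self, obj):
        return obj.date

# urls.py -- needs "django.contrib.sitemaps" (and the sites framework)
# in INSTALLED_APPS
from django.contrib.sitemaps.views import sitemap
from django.urls import path
from .sitemaps import ArchivedMessageSitemap

urlpatterns = [
    path("sitemap.xml", sitemap,
         {"sitemaps": {"messages": ArchivedMessageSitemap}},
         name="django.contrib.sitemaps.views.sitemap"),
]

For a large archive Django paginates this automatically and can also serve a sitemap index via django.contrib.sitemaps.views.index, so the 50,000-URL limit per sitemap file is not a problem.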
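
For the publication-date bullet, one option (again just a sketch, not Hyperkitty's actual template code) is a small template tag that emits schema.org structured data with datePublished; the subject and date attribute names are assumptions:

# templatetags/archive_meta.py -- sketch; attribute names are assumptions
import json

from django import template
from django.utils.safestring import mark_safe

register = template.Library()

@register.simple_tag
def message_jsonld(email):
    """Emit schema.org metadata so search results can show the post date."""
    data = {
        "@context": "https://schema.org",
        "@type": "DiscussionForumPosting",
        "headline": email.subject,
        "datePublished": email.date.isoformat(),
    }
    # Escape "<" so a malicious subject cannot close the <script> tag.
    payload = json.dumps(data).replace("<", "\\u003c")
    return mark_safe('<script type="application/ld+json">%s</script>' % payload)

The message template would then call this tag in the page head; an equivalent <meta property="article:published_time" ...> tag would also work.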
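
For the immediate-notification bullet, one possible approach (only a sketch; the settings names, the Email model and the synchronous HTTP call are assumptions) is to hook Django's post_save signal and submit the new URL via the IndexNow protocol:

# signals.py -- sketch; SITE_BASE_URL and INDEXNOW_KEY are hypothetical settings
import logging

import requests
from django.conf import settings
from django.db.models.signals import post_save
from django.dispatch import receiver

from hyperkitty.models import Email  # assumed model name

log = logging.getLogger(__name__)

@receiver(post_save, sender=Email)
def notify_search_engines(sender, instance, created, **kwargs):
    if not created:
        return
    page_url = settings.SITE_BASE_URL + instance.get_absolute_url()
    try:
        # IndexNow (https://www.indexnow.org/); the key file must be served
        # from the site so the engines can verify ownership.
        requests.get("https://api.indexnow.org/indexnow",
                     params={"url": page_url, "key": settings.INDEXNOW_KEY},
                     timeout=5)
    except requests.RequestException:
        log.warning("Could not submit %s to IndexNow", page_url)

In a real implementation the notification should be queued as a background task (Hyperkitty already uses django-q, if I am not mistaken), so that archiving is never blocked by a slow search-engine endpoint.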
I have experience with integrating webpages with search engines on the topics in the above bullets. If somebody is willing to implement these features in Hyperkitty, I can answer any questions related to the search engines, though not questions about the code of Hyperkitty itself.
The biggest friends of search engines are fast-loading webpages. Removing the dependency on jQuery in django-mailman3, Hyperkitty and Postorius would result in somewhat faster-loading pages.
I assume that some administrators (like me) refrain from letting search engines index the archives, because their servers do not have enough capacity to handle the heavy periodic crawling traffic. Implementing the above suggestions would remove that obstacle and let search engines index the public archives.
Greetings
Дилян