[Distutils] Basic Markdown Readme Support

Donald Stufft donald at stufft.io
Tue May 3 10:12:57 EDT 2016


> On May 3, 2016, at 9:47 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> On 3 May 2016 at 23:18, Fred Drake <fred at fdrake.net> wrote:
>> My perspective, for what it's worth, is that while I find Markdown a
>> horrible pain, there are a lot of people who pick it up before picking
>> up Python, and tools like GitHub and BitBucket encourage (and make it
>> easier to add) README.md to a project. For someone who isn't familiar
>> with reStructuredText, it's an easier on-ramp.
>> 
>> So while I'm all for encouraging developers to prefer
>> reStructuredText, I'm in favor of supporting Markdown as a
>> long_description format. The format for a README file just doesn't
>> seem such a big deal that alienating potential community members is
>> worth it.
> 
> Exactly. The lack of support for Markdown README files is mainly a
> matter of a historical quirk of the way the PyPI metadata upload API
> works making this way more work to implement than it seems at first
> glance, the current PyPI codebase being sufficiently fragile that we
> actively avoid changing it, and getting Warehouse to the point of
> being sufficiently feature complete for it to take over primary
> service responsibilities being a long hard slog for the folks working
> on it (it's a lot harder to find volunteers interested in working on
> paying down technical debt than it is to find folks that want to work
> on new user facing features).

This is basically the answer here. It looks like the original post by Nick
Timkovich tried to start a discussion about what this might look like, but
I think it focused too much on the setup.py API, which isn't really the
issue; we can do whatever we like there. The real issue is how the markup
is represented in the metadata.

Right now we have a singular metadata field which just contains all of the text
of the long_description without any other information about it. The simplest
thing to do is probably to just add a new field, something like::

    Description-Markup: <value>

We'd need to define some values for it like "txt", "md", "rst" or something
along those lines. I'd suggest file extensions so that in the future we can
move the long description into its own file in the metadata and just move that
value to the file extension (like `DESCRIPTION.[ext]`). We'd also want to
declare what behavior should be expected when that value doesn't exist
(codifying the current behavior: attempt to render as rst, fall back to
txt).
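
As a rough illustration of how a tool might read such a field, here is a
minimal sketch. It assumes the field is literally named Description-Markup
and that its values are bare file extensions; both are only suggestions from
this message, not part of any accepted metadata spec, and the helper names
are hypothetical.

```python
from email.parser import Parser

# Hypothetical values taken from the suggestion above ("txt", "md", "rst").
KNOWN_MARKUPS = {"txt", "md", "rst"}


def declared_markup(pkg_info_text):
    """Return the declared markup in lowercase, or None if the field is absent.

    PKG-INFO uses RFC 822 style headers, so the stdlib email parser can
    read it. "Description-Markup" is the proposed (not yet real) field.
    """
    headers = Parser().parsestr(pkg_info_text, headersonly=True)
    value = headers.get("Description-Markup")
    if value is None:
        return None
    return value.strip().lower()


def description_filename(markup):
    """Map a markup value onto the suggested DESCRIPTION.[ext] file name."""
    return "DESCRIPTION." + markup
```

A missing field returns None so that callers can apply the fallback
behavior (try rst, then txt) rather than treating it as an error.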

We'd probably also want some recommendations on what the different types of
tools should do when encountering invalid markup for the declared markup
language and also what they should do when encountering a markup language they
don't know. For the server side (e.g. PyPI) I'd suggest rejecting the upload
whenever the description is invalid for its declared markup or declares an
unknown markup (an undeclared markup is never invalid, it just does the
fallback), and anything client side just falls back to plaintext.
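
The server and client policies described above could be sketched roughly as
follows. Everything here is an assumption made for illustration: the function
names, the UploadRejected exception, and the "validators" mapping (markup
name to a callable that returns True when the text renders cleanly) are all
hypothetical, and I'm assuming clients apply the same rst-then-txt fallback
when nothing is declared.

```python
class UploadRejected(Exception):
    """Hypothetical error a server would raise to refuse an upload."""


def fallback_markup(text, validators):
    # Current behavior: try rst first, fall back to plain text.
    return "rst" if validators["rst"](text) else "txt"


def server_markup(markup, text, validators):
    """Server side: hard fail on unknown or invalid declared markup."""
    if markup is None:
        # An undeclared markup is never invalid; it just does the fallback.
        return fallback_markup(text, validators)
    if markup not in validators:
        raise UploadRejected("unknown markup: %r" % markup)
    if not validators[markup](text):
        raise UploadRejected("description is not valid %s" % markup)
    return markup


def client_markup(markup, text, validators):
    """Client side: never fail, degrade to plain text instead."""
    if markup is None:
        return fallback_markup(text, validators)
    check = validators.get(markup)
    if check is None or not check(text):
        return "txt"
    return markup
```

Keeping the validators pluggable means the same policy code works whether a
tool checks rst with docutils, Markdown with some other library, or nothing
at all.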

This has a few benefits:

* No more ugly output when plaintext (or Markdown!) accidentally gets rendered
  as reStructuredText.
* We can hard fail uploads when their rendering is broken, leading people to
  be able to fix the problems instead of ending up with a bunch of broken
  markup all over PyPI.
* We can allow markup languages other than txt and rst.

However, I'm not going to have time or motivation to really work this into a
valid spec or even a fully fleshed out idea. Nor will I be able to handle
implementing this in PyPI or Warehouse right now since I'm primarily focused on
trying to get Warehouse itself deployed for real. I am happy to review PRs and
actual specs though, whether they take this idea or they use a different one.

> 
> However, this SO answer provides some ideas on ways to convert from
> Markdown to reStructured Text when producing the sdist metadata, or to
> derive a checked in .rst file from a README.md file:
> http://stackoverflow.com/questions/10718767/have-the-same-readme-both-in-markdown-and-restructuredtext
> 
> It also seems plausible to me that a client-side solution could be
> designed that allowed the description metadata stored in the sdist to
> be overridden when uploading to PyPI (i.e. the description in PKG-INFO
> could be Markdown, but the upload tool could use pypandoc to convert
> that to reStructuredText in the uploaded metadata). I'm not sure if
> Donald would be open to that in twine (presumably via an extra to
> avoid having pypandoc as a standard dependency), but client-only
> changes are generally an easier pitch than changes to the interfaces
> between client tools and PyPI.
> 

I actually mostly don’t do much with twine anymore; Ian Cordasco has more or
less taken over maintenance of it, so it'd be up to him ultimately. That being
said, I think doing it in twine is the wrong layer. Ideally the metadata in
PyPI matches the metadata in the file and people can "recreate" the PyPI
database using nothing but the files [1].

I think if you want to shim over this on the client side, your best hope would
be in setuptools, but I think if someone is motivated enough to actually do
the spec and implementation work we can get proper support landed too.
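
For anyone who wants a client-side shim in the meantime, a minimal sketch of
the setup.py-level approach might look like the following. It assumes
pypandoc (and the pandoc binary it wraps) may or may not be installed, and
the helper name long_description_from is hypothetical. This is a workaround
under those assumptions, not the proper metadata support discussed above.

```python
import io


def long_description_from(path):
    """Return README contents, converted to rst when pypandoc is usable.

    Falls back to the raw file text if pypandoc is not installed or the
    pandoc binary cannot be found.
    """
    try:
        import pypandoc
        # pandoc infers the input format from the file extension (.md).
        return pypandoc.convert_file(path, "rst")
    except (ImportError, OSError):
        with io.open(path, encoding="utf-8") as f:
            return f.read()
```

In a setup.py this would be passed along as
long_description=long_description_from("README.md"). Note that the fallback
branch hands raw Markdown to PyPI, which is exactly the case where the
description ends up shown as plain text today.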


[1] Ok, this doesn't exactly work because of the dynamic nature of setup.py,
    but in practice you can get close, and we're moving closer to that day.


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
