Slide deck for my presentation at MacAD.UK 2018. In this talk I cover how I use Python docstrings, reStructuredText, and Sphinx to generate human-readable documentation from my code base, and then automate the creation of that documentation with ReadTheDocs.org.
**Some of the slides contained video that is not shown in this copy.**
3. Your Code Should Document Itself
Today, I’m going to talk about documentation. Specifically, a way for you to approach documentation for Python packages and applications that you create. For that,
we’re going to talk about some conventions and some tools that are going to make it possible for your code to document itself.
4. Documentation is Important(?)
“We should document that.”
— {Insert Name}
If you’re here, you probably acknowledge that documentation is important.
There is a common saying in our profession.
5. Documentation is Important(?)
“Writing is easy. All you do is stare at a
blank sheet of paper until drops of blood
form on your forehead.”
— Gene Fowler
And it isn’t that writing documentation is hard to do.
It isn’t always that we tell ourselves, “I should probably write down these changes I made, but I don’t feel like doing it now.”
I think what we really tell ourselves, more often than not, is, “I should write down these changes, but I don’t want to jump over to my wiki right now.”
To me, it’s more likely that I won’t keep up on my documentation if I’m having to jump back and forth between different systems. It interrupts my focus, and that makes
me lose productivity.
6. Documentation as Code
Now, if my documentation lives with my code, and lives inside that code too, then that mental barrier goes away. If I’m working on updates to a script or an application
and my documentation is right there in front of me as I work, it starts to become a much more seamless experience.
I’m a Python developer, and Python has a convention for this already.
7. The MacTroll Slide
If you’re here, I think you knew that this was going to be a Python-centric topic.
Yesterday, when Joel was showing off all the awesomeness that is NoLoad, I don’t think it would have killed him to say something nice about Python. Like, he could have
mentioned how Python has 5x the presence on StackOverflow compared to Swift. Something like that.
8. DocStrings
“A docstring is a string literal that occurs
as the first statement in a module,
function, class, or method definition.”
— PEP-257
But, back to the topic at hand. The convention I mentioned before is the docstring. What is a docstring? This is the definition right out of PEP-257.
PEP stands for Python Enhancement Proposal.
This is easier to understand when seeing code, and if you’ve written Python before you’re probably already familiar with these.
9. DocStrings
A docstring provides a description for any module, function, class, or method in your code.
PEP-257 is focused entirely on Docstring Conventions: how they are written and what they contain at a high level. As with most PEPs, these are not laws you have to
follow. Conventions are just that.
The basic docstring is a single string directly after the declaration. This string is to be encapsulated in triple double-quotes.
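The slide code is not included in this copy, so here is a minimal sketch of what a one-line docstring looks like. The function body is my own stand-in for the hello_world() example referred to in the following slides.

```python
def hello_world():
    """Print 'Hello, World!' to the console."""
    print("Hello, World!")


# The docstring is stored on the function object and is what help() displays.
print(hello_world.__doc__)
```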
10. DocStrings
A multi-line docstring should start after the first set of triple quotes and end with the last set of triple quotes on their own line. No blank lines at the start or the end of a
docstring.
In our updated hello_world() we’ve added an argument for a ‘name’, giving two potential outcomes, and added to the description detailing this.
Now if someone is inside our code they can read the docstring to quickly learn what our function does, but this simple docstring can be expanded further.
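A sketch of the multi-line form described above, assuming a hello_world() that takes an optional name. The wording of the description is illustrative and not taken from the slide.

```python
def hello_world(name=None):
    """Print a greeting to the console.

    If a name is given, the greeting is addressed to that name.
    Otherwise the classic 'Hello, World!' is printed.
    """
    if name:
        print("Hello, {}!".format(name))
    else:
        print("Hello, World!")
```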
11. DocStrings
I’ve changed hello_world() to return strings instead of printing them now. I’ve also added to our docstring. There are now definitions for the ‘name’ argument and what is
being returned by the function.
This is reStructuredText, and this markup is used by a tool called Sphinx to generate documentation from our code.
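Roughly, the returning version with reStructuredText info fields could look like the sketch below. The field descriptions are my own wording. The :param:, :returns:, and :rtype: fields are the ones Sphinx recognizes for Python objects.

```python
def hello_world(name=None):
    """Return a greeting.

    :param str name: Optional name to address the greeting to.
    :returns: The greeting text.
    :rtype: str
    """
    if name:
        return "Hello, {}!".format(name)
    return "Hello, World!"
```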
12. DocStrings
This presentation is focused on using reStructuredText. That’s what I use, but I want to call out that there is an alternative style which came out of Google that looks like
this.
If you search for “Google Style Docstrings” you can learn more about this syntax.
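For comparison, here is the same hypothetical function written with Google style sections. This is a sketch, not the slide's code. As far as I know, Sphinx needs the sphinx.ext.napoleon extension enabled to render this style.

```python
def hello_world(name=None):
    """Return a greeting.

    Args:
        name (str): Optional name to address the greeting to.

    Returns:
        str: The greeting text.
    """
    if name:
        return "Hello, {}!".format(name)
    return "Hello, World!"
```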
13. reStructuredText
reStructuredText is a markup language not too dissimilar from Markdown, but it is geared specifically towards code documentation and has a lot of really powerful
features.
Still, the basic formatting available should be familiar to anyone who has used Markdown.
17. reStructuredText
Headers are interesting. There are no rules about what characters define a header, just that the header character line must be at least as long as the header text. The
overlines are optional.
The first character used for a header line becomes H1. The second encountered becomes H2, and so on. You get to determine header styles as a part of your
documentation.
The snippet on the right is a recommendation from the Python docs on what to use.
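To make the length rule concrete, here is a small helper that builds a heading with an adornment line matching the text. The helper name rst_heading is hypothetical. The convention I know from the Python documentation style guide is # with overline for parts, * with overline for chapters, = for sections, - for subsections, ^ for subsubsections, and " for paragraphs.

```python
def rst_heading(text, char="=", overline=False):
    """Return ``text`` formatted as a reStructuredText heading.

    :param str text: The heading text.
    :param str char: The punctuation character used for the adornment line.
    :param bool overline: Also add a matching line above the text.
    :returns: The heading as a multi-line string.
    :rtype: str
    """
    rule = char * len(text)
    lines = [rule, text, rule] if overline else [text, rule]
    return "\n".join(lines)


print(rst_heading("Installation", "*", overline=True))
print(rst_heading("Requirements", "="))
```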
19. reStructuredText
Code blocks are simple. You can end a paragraph with two colons which will then render the following indented lines as preformatted text.
By default, the highlighting will be for Python code.
21. reStructuredText
Alternatively, you can use the code-block directive and specify the language or format for the highlighting. This is a style I prefer using as it’s more readable at a glance.
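Both ways of writing a code block can sit inside a docstring. The sketch below uses a made-up greet_all() function to show the double-colon form and the code-block directive side by side.

```python
def greet_all(names):
    """Return a greeting for every name in ``names``.

    A paragraph ending in two colons introduces a literal block::

        greet_all(["Ada", "Grace"])

    The ``code-block`` directive names the language explicitly:

    .. code-block:: json

        ["Hello, Ada!", "Hello, Grace!"]

    :param list names: The names to greet.
    :rtype: list
    """
    return ["Hello, {}!".format(name) for name in names]
```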
22. reStructuredText
Coming back to the hello_world() function, let’s focus in on the reStructuredText that was added to the docstring.
23. reStructuredText
There are keywords that we included in the docstring delimited by colons. These are called “info fields”. They’re a part of a Sphinx domain. A domain is a collection of
reStructuredText directives and roles, which are a part of the reStructuredText syntax, for describing code objects.
The ‘param’ key defines a parameter for the function, class, or method the docstring belongs to. After declaring that we’re describing a parameter, we can provide the
type of object that the parameter should be. This one is a basic string type, but you can define any Python object, even from third party modules, as a type. The third part
is the parameter’s label.
After the closing colon we can provide a user friendly description.
24. reStructuredText
This one line definition can be broken out into two for the sake of readability using the ‘type’ key. Again, you provide the label for the parameter that you are defining. This
is a better option when you are referencing objects from other modules that have longer names.
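A sketch of the two-line form, using a hypothetical format_price() function and the standard library's decimal.Decimal as the longer type name.

```python
import decimal


def format_price(amount, currency="GBP"):
    """Return ``amount`` as a display string.

    :param amount: The price to format.
    :type amount: decimal.Decimal
    :param str currency: Currency code placed in front of the amount.
    :returns: The currency code followed by the amount with two decimals.
    :rtype: str
    """
    return "{} {:.2f}".format(currency, amount)
```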
26. reStructuredText
You aren’t limited to specifying a single type, or just specifying a generic type for a collection.
Here we are defining types for lists, dicts, and tuples; and also what type of object they contain.
We can define multiple accepted or returned types by using the ‘or’ operator.
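As an illustration, a made-up count_words() function with container types and an ‘or’ between accepted types. The type(contents) spelling is the one I know Sphinx links automatically.

```python
def count_words(words, ignore=None):
    """Count how often each word appears.

    :param words: The words to count.
    :type words: list(str)
    :param ignore: Words to skip, or None to count everything.
    :type ignore: tuple(str) or None
    :returns: Each word mapped to its number of occurrences.
    :rtype: dict(str, int)
    """
    skipped = set(ignore or ())
    counts = {}
    for word in words:
        if word not in skipped:
            counts[word] = counts.get(word, 0) + 1
    return counts
```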
27. reStructuredText
Let’s take a look at a fully documented Python class. I’m going to be referencing a project I’ve been working on called ODST. In it is this AESCipher class.
28. reStructuredText
In here there are two new info fields: ‘attr’ and ‘raises’.
The ‘attr’ defines an attribute for an object. The block_size is a class attribute that’s defined by a value from the cryptography module. You only provide a description for
this, not a type.
29. reStructuredText
‘raises’ lets you define what exceptions can be raised and a description for why they occur. You can have multiple ‘raises’ statements in your docstring: one for each type
of exception.
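The AESCipher code is not reproduced in this copy, so here is a stand-in sketch of multiple ‘raises’ fields on a hypothetical parse_port() function.

```python
def parse_port(value):
    """Convert ``value`` to a TCP port number.

    :param str value: The port as text, for example ``"8080"``.
    :returns: The port number.
    :rtype: int
    :raises TypeError: If ``value`` is not a string.
    :raises ValueError: If ``value`` is not a number between 1 and 65535.
    """
    if not isinstance(value, str):
        raise TypeError("value must be a string")
    port = int(value)  # int() raises ValueError for non-numeric text
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: {}".format(port))
    return port
```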
30. reStructuredText
This is a private method for the class for generating cipher objects that are used by the encrypt() and decrypt() methods.
In fact, inline with our docstring text are Sphinx roles that point to those functions. You’re going to see how this matters in a minute.
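A sketch of inline roles in a private helper's docstring, using a made-up Greeter class rather than the real AESCipher. The :meth: role is what Sphinx turns into links to the named methods.

```python
class Greeter:
    """Build greetings that share one punctuation style."""

    def __init__(self, punctuation="!"):
        self.punctuation = punctuation

    def _finish(self, text):
        """Append the greeter's punctuation to ``text``.

        Helper for :meth:`hello` and :meth:`goodbye`; not meant to be
        called directly.

        :param str text: The greeting without punctuation.
        :rtype: str
        """
        return text + self.punctuation

    def hello(self, name):
        """Return a hello for ``name``. See also :meth:`goodbye`."""
        return self._finish("Hello, {}".format(name))

    def goodbye(self, name):
        """Return a goodbye for ``name``. See also :meth:`hello`."""
        return self._finish("Goodbye, {}".format(name))
```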
33. reStructuredText
The return for this method isn’t one of the standard objects. This is an example of referencing an object from an imported module as a type.
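The same idea in a runnable sketch, with the standard library's pathlib.Path standing in for the third-party cipher object. The docs_dir() function is hypothetical.

```python
import pathlib


def docs_dir(project_root):
    """Return the location of the project's docs directory.

    :param str project_root: Path to the root of the project.
    :returns: The ``docs`` directory inside ``project_root``.
    :rtype: pathlib.Path
    """
    return pathlib.Path(project_root) / "docs"
```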
36. Sphinx
This is where we shift to talking about Sphinx. Sphinx is a tool for generating documentation from code using reStructuredText. It was originally created for the docs for
python.org. It has since expanded, but at the core it is still a Python-centric documentation tool.
The page you see here contains the docstrings of the AESCipher class. What this page looks like in reStructuredText is…
37. Sphinx
These new directives are a part of a Sphinx extension called autodoc. Autodoc imports code in order to read the docstrings and render them with the page. Sphinx, like
many other tools, is extendable.
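The page source from the slide is not shown here. As a rough idea of how little is needed, this sketch generates a page that hands a module to autodoc. The helper autodoc_page() and the module name my_project.security are made up. The automodule directive and its :members: option come from autodoc.

```python
def autodoc_page(title, module_name):
    """Return reStructuredText for a page documenting ``module_name``.

    :param str title: The page heading.
    :param str module_name: Dotted import path of the module to document.
    :rtype: str
    """
    lines = [
        title,
        "=" * len(title),
        "",
        ".. automodule:: {}".format(module_name),
        "   :members:",
        "",
    ]
    return "\n".join(lines)


print(autodoc_page("Security", "my_project.security"))
```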
38. Sphinx
Let’s walk through setting up a project with Sphinx so we can build our docs. I’m going to continue using my ODST project as the example.
Sphinx is available in the Python Package Index. Install with pip. There are four commands that get installed with it:
sphinx-apidoc, sphinx-autogen, sphinx-build, and sphinx-quickstart. For the setup, we’re only going to be using sphinx-quickstart.
In the project’s directory, make a new directory called “docs”.
Then, run the sphinx-quickstart.
39. Sphinx
sphinx-quickstart is an interactive wizard to configure your project’s docs.
For each question, the option in brackets is the default and you can hit enter to continue.
The docs root path is going to be the current directory (the “.”).
40. Sphinx
After setting the project name and author, sphinx-quickstart asks for the project’s version. I’m going to leave this blank and show you the way I handle version setting
later.
Again, I’m continuing to stick with the default options. Of course, set the docs to whatever language you write them in.
41. Sphinx
Note here that there is an option for an epub builder. My presentation is focused on generating HTML formatted docs, but keep this in mind if you’re looking for an
alternative means of distribution.
Now, we’re going to be asked about a lot of common extensions to Sphinx that we may want to include. The three I have selected for ODST are autodoc, todo, and
coverage. I already touched on what autodoc does. Todo renders “todo” directives similarly to “info” or “warning” messages. The coverage extension is a handy tool to
ensure all the modules, classes, and functions that you have included in your docs are documented. We’ll see examples of these a little later.
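In the generated conf.py, those three selections should end up looking roughly like this. It is a sketch of the relevant fragment, not the full file from the project.

```python
# conf.py (fragment)

# Extensions selected in the sphinx-quickstart wizard.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.todo",
    "sphinx.ext.coverage",
]

# todo directives are only rendered in the output when this is enabled.
todo_include_todos = True
```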
42. Sphinx
The Makefile option is really handy. I’ll be showing you how to use this as well as sphinx-build. I’m opting out of a Windows command file, but you might want to
include this if someone is working on the project on a Windows machine.
This completes the wizard and sphinx-quickstart will generate the files and directories required.
44. Sphinx
Before editing the index.rst file and creating new pages, we need to make a few edits to the conf.py file. This file is where all of our selections from sphinx-quickstart
went.
First, because we’re using autodoc, we need to uncomment these three lines and update the absolute path to be the parent of the docs/ directory. This will allow Sphinx to
be able to locate our module when building.
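Assuming conf.py sits directly inside docs/, the three uncommented lines would look something like this:

```python
# conf.py (fragment)
import os
import sys

# conf.py lives in docs/, so ".." is the project root that holds the package.
sys.path.insert(0, os.path.abspath(".."))
```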
45. Sphinx
A little further down we see where those version strings would have gone. While we could have hard-coded this, my preference is to import the project’s module and read
its __version__ dunder in. That way whenever I increment the value it is automatically reflected in the docs.
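A sketch of that approach. The package name my_project and the helper project_version() are hypothetical, and the stand-in module exists only so the example runs on its own. In a real conf.py you would import your actual package.

```python
# conf.py (fragment)
import importlib
import sys
import types

# Stand-in for the real project package so this sketch is self-contained.
_demo = types.ModuleType("my_project")
_demo.__version__ = "1.4.2"
sys.modules["my_project"] = _demo


def project_version(module_name):
    """Import ``module_name`` and return its ``__version__`` string."""
    return importlib.import_module(module_name).__version__


# The full version string, including any patch level.
release = project_version("my_project")
# The short X.Y version.
version = ".".join(release.split(".")[:2])
```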
47. Sphinx
Now we can go to the index.rst file and start populating content.
At the top here is a comment. Two periods followed by a space denotes a hidden comment that won’t be output when we build our documentation. Like Python, blocks
of text are kept together through indentation. The comment ends at the first dedented line.
48. Sphinx
These will display special colored text boxes. The note directive is a little more subtle while the warning directive stands out. Anywhere we use the todo directive will also render like a note because we
loaded that Sphinx extension.
49. Sphinx
These are Table of Contents trees. For this project, I have a number of subdirectories that contain more .rst files. In my main page I split them apart visually by having
multiple toctree directives.
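Putting the pieces from the last few slides together, an index.rst could be laid out along these lines. The page names and captions are invented, and the text is held in a Python string only so it can be printed or written out by a script. The boxes use the docutils directive names note and warning.

```python
INDEX_RST = """\
.. This comment is not rendered in the built docs.
   Indented lines continue the comment.

My Project
==========

.. note::

   A subtle informational box.

.. warning::

   A box that stands out.

.. todo::

   Rendered only when the todo extension is enabled.

.. toctree::
   :maxdepth: 2
   :caption: Setup

   setup/install
   setup/configure

.. toctree::
   :maxdepth: 2
   :caption: Reference

   api/admin
"""

print(INDEX_RST)
```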
51. Sphinx
The rest of this page is all content that uses the normal roles and directives I covered before. This is all stuff that doesn’t need to live inside the code itself, but still exists
within my project repository.
52. Sphinx
Back in the command line, I can generate the documentation in two ways.
The first is to use sphinx-build.
The second is to use the Makefile.
53. Sphinx
By default, these commands will only work on new or modified files and leave the rest alone. If you want to do a clean build of the docs, you can do so like this.
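The commands on the slides are not included in this copy. As a sketch, this is how the sphinx-build invocation could be assembled from a script run inside docs/. The function names are hypothetical. The -b, -E, and -a options are real sphinx-build flags.

```python
import subprocess


def sphinx_build_command(builder="html", source=".", build_root="_build", fresh=False):
    """Return the sphinx-build argument list for one builder.

    :param str builder: Name of the Sphinx builder, such as ``html``.
    :param str source: Directory that contains conf.py.
    :param str build_root: Directory that receives the build output.
    :param bool fresh: Rebuild everything instead of only changed files.
    :rtype: list(str)
    """
    command = ["sphinx-build", "-b", builder]
    if fresh:
        # -E ignores the saved environment, -a rewrites every output file.
        command += ["-E", "-a"]
    command += [source, "{}/{}".format(build_root, builder)]
    return command


def build_docs(**kwargs):
    """Run sphinx-build; requires Sphinx to be installed and on the PATH."""
    return subprocess.run(sphinx_build_command(**kwargs), check=True)
```

With the generated Makefile, the equivalent should be "make html", with "make clean" beforehand for a clean build.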
54. Sphinx
The HTML gets output to our _build/ directory and we can see how everything rendered.
Now, this is a little bland. The default theme for Sphinx is the Alabaster theme. It’s based on the themes developed for the Flask and Requests projects.
While nice, and minimalist, it’s not to my taste. There is another theme that I prefer, and it is my default for all my projects.
55. Sphinx
This is the Read the Docs theme. I prefer it to alabaster for the better contrast of colors, especially with the message boxes and sidebar. You can also see the version of
our docs in the upper-left corner instead of just inside the page title.
56. Sphinx
Switching Sphinx themes is very simple to do. The theme is installed with pip like any other Python package.
In our conf.py file, we go to the html_theme line and switch out “alabaster” for “sphinx_rtd_theme”.
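After installing the theme package with pip, the change in conf.py comes down to one line:

```python
# conf.py (fragment)

# Was "alabaster", the Sphinx default.
html_theme = "sphinx_rtd_theme"
```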
57. Sphinx
Back to the security page from earlier, I had added in another sub-module, and a todo note about adding in the authentication decorators. But, there’s nothing under my
Passwords section.
58. Sphinx
The changes to the .rst file look fine. It should be automatically pulling in the docstrings from my passwords module.
59. Sphinx
Nothing is showing because I forgot to document the password functions. The coverage extension I added in earlier can help identify anywhere in my
docs where I’m referencing code that’s undocumented.
Using sphinx-build, I can specify coverage as the build type, target the current directory (which is my docs), and set an output directory inside _build/.
Printing out the python.txt file shows every module and what objects were left undocumented. A really useful tool as your project expands and your documentation
increases.
60. Sphinx
There’s another useful extension I want to highlight. I’ve written a number of Flask applications, and providing easy to read documentation for the APIs those apps
expose is really critical for others to use them.
This page for ODST’s Admin API is generated using that extension.
61. Sphinx
Like themes, Sphinx extensions are installed through pip. The extension for our API documentation adds an HTTP Domain.
In the conf.py file, we add three modules from this extension under the original three added in the wizard.
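The talk does not name the package in this copy. Assuming it is sphinxcontrib-httpdomain, the extensions list would grow to something like the following. Treat the three module names as my assumption rather than the slide's content.

```python
# conf.py (fragment)
extensions = [
    # The original three from the wizard.
    "sphinx.ext.autodoc",
    "sphinx.ext.todo",
    "sphinx.ext.coverage",
    # Assumed: modules provided by the sphinxcontrib-httpdomain package.
    "sphinxcontrib.httpdomain",
    "sphinxcontrib.autohttp.flask",
    "sphinxcontrib.autohttp.flaskqref",
]
```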
62. Sphinx
This is the full .rst file for the Admin API page.
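There are two directives here from the HTTP domain that was added. The qrefflask directive generates the table of contents for the endpoints at the top of the page. The autoflask
directive generates the documentation for each endpoint from the docstrings of the route functions.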
One thing to note: in a Flask app, everything is represented by an application object that you create. In my programs I use a factory pattern where the app object is
created by a function. In these lines I’m calling the factory function so the app object can be fed into Sphinx.
63. Sphinx
In the Python file for the Admin API routes, each route has a function associated with it. Each one of those functions’ docstrings contains all the information for the HTTP
request.
64. Sphinx
Some new markup, again brought in by the HTTP domain. The quickref line provides a group name and a description separated by a semicolon. You can group your routes
together in the documentation by giving them all a matching group name.
The sourcecode directive is like code-block, but here it is set to highlight HTTP syntax. Below where we state the Accept header there’s a JSON example, and that will be highlighted
correctly within the block.
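A sketch of what such a route docstring can look like, assuming the sphinxcontrib-httpdomain field names. The endpoint, group name, and payload are invented, and the Flask route decorator is left out so the example runs without Flask installed.

```python
def list_packages():
    """Return every package known to the server.

    .. :quickref: Packages; List all packages.

    **Example request**:

    .. sourcecode:: http

        GET /api/admin/packages HTTP/1.1
        Accept: application/json

    **Example response**:

    .. sourcecode:: http

        HTTP/1.1 200 OK
        Content-Type: application/json

        {"packages": []}

    :reqheader Accept: ``application/json``
    :resheader Content-Type: ``application/json``
    :status 200: The list was returned.
    """
    # In a real app this function would be registered as a Flask route.
    return {"packages": []}
```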
65. Sphinx
Again, another route, but this time a POST with a JSON example under the Content-Type header. In the quickref I’m using the same group name, so these routes will be
listed together in that generated table of contents.
66. Sphinx
And the response.
Because everything is in the docstring for the route’s function, I don’t need to jump away from the code as I make changes in order to update my documentation.
67. Sphinx
There is still a gap in this process. The code has well written docstrings, the code is referenced in Sphinx to generate our HTML documentation, but generating that
documentation is still a manual process. What is needed is a process for creating our documentation and posting it in a readable location for our users.
68. Read the Docs (.org)
This is where Read the Docs comes in.
69. Read the Docs (.org)
Read the Docs is a free documentation hosting site that works with source control systems and Sphinx.
70. Read the Docs (.org)
Read the Docs works directly with GitHub to create hooks into the projects that you set up.
Once linked, every commit you make to your repository will trigger a documentation build here. Your docs will update every time you push new code, thus fulfilling the
goal of our code documenting itself.
71. Read the Docs (.org)
This can be taken further by having documentation builds on separate branches. If you use a branching strategy where you treat the “master” branch as your production
code, and you commit working branches to a “develop” branch, you can set up each branch as its own version in Read the Docs and serve pre-release documentation.
72. Read the Docs (.org)
On hosted documentation, there is a green link in the lower-left corner that pops up a menu. Here, the user can switch between the different versions, and also download
copies of the documentation in HTML, PDF, and Epub.
73. Read the Docs (.org)
If you can’t use the native import features on Read the Docs, you can set up a repository with the service manually. When you do so, you can set a “Generic API Incoming
Webhook” for the docs. Whenever you are ready for a build, make an HTTP POST request to this URL using the provided token. This will trigger a pull of the latest code
and start the build job.
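A minimal sketch of that POST request using only the standard library. The function names are hypothetical, the URL and token come from your project's settings page, and the optional branches field is my assumption about the webhook's parameters.

```python
import urllib.parse
import urllib.request


def build_webhook_request(webhook_url, token, branch=None):
    """Prepare the POST request that asks Read the Docs for a build.

    :param str webhook_url: The webhook URL shown in the project settings.
    :param str token: The token provided alongside the webhook URL.
    :param branch: Branch to build, or None for the project default.
    :type branch: str or None
    :rtype: urllib.request.Request
    """
    fields = {"token": token}
    if branch:
        fields["branches"] = branch  # assumed optional parameter
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(webhook_url, data=data, method="POST")


def trigger_build(webhook_url, token, branch=None):
    """Send the request and return the HTTP status code."""
    request = build_webhook_request(webhook_url, token, branch)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the request construction separate from the sending makes it easy to check what would be posted before wiring it into a build job.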
74. Read the Docs (.org)
?
And if you’re not using public repositories and can’t use a public documentation host?
In that case, you’re going to need to roll your own. The good news is, you can still use a build tool like Jenkins or Bamboo to perform the build of your documentation. As
an extra step, you’ll need to take that html build and then copy it somewhere that it can be served from. A basic HTTP server can suffice for that if you can “scp” the files
into it.
75. Read the Docs (.org)
Another alternative is embedding the generated documentation with your deployed application and making it a part of the service. In this example, there is a
“Documentation” button I’ve added to the ODST GUI that takes you to the generated Sphinx docs.
76. Would you like to know more?
Sphinx Tutorial
http://www.sphinx-doc.org/en/stable/tutorial.html
reStructuredText Primer
http://www.sphinx-doc.org/en/stable/rest.html
Sphinx Markup Constructs
http://www.sphinx-doc.org/en/stable/markup/index.html
Sphinx Domains
http://www.sphinx-doc.org/en/stable/domains.html