#!/usr/bin/env python3
###############################################################################
#
# Copyright (c) 2025, Anders Andersen, UiT The Arctic University of
# Norway. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# - Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# - Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
#
# - Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
# COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
###############################################################################
R"""A Python module for generating Python module documentation
The `pypimdoc` module generates documentation for a Python module
based on a README template file and the documentation text strings
from the Python source code of the module. The README template file
is a Markdown text document with the additions of what we call *module
documentation macros*, or just *MD-macros* for short. These macros
are on separate lines inside HTML comments:
```
<!-- doc(PyPiMDoc.process_template, hlevel=2) -->
```
In the example above, the MD-macro line will be replaced by the
documentation string of the `process_template` method of the
`PyPiMDoc` class with a level 2 heading. If the module is used in
inline-mode (and not the default HTML comment block-mode), the
MD-macros are placed inline in the markdown template file and not
inside HTML comments:
```python
doc(PyPiMDoc.process_template, hlevel=2)
```
Other available MD-macros include the heading macro `h` (to create a
heading), the table of contents macro `toc` (to create a table of
contents), the Python code evaluation macro `eval` (to insert the
result of the evaluated Python code), the shell command macro `cmd`
(to insert the result of the executed shell command), and the help
text macro `help` (to insert the output of the help command `-h` when
the module is used as a console script).
The `pypimdoc` module also provides an MD-macro to loop over a list:
```
<!-- for_in(x, [_ismethod, _get_doc], "doc(x, hlevel = 3)") -->
```
The example above has the same effect as inserting the following lines
in the README template:
```
<!-- doc(_ismethod, hlevel = 3) -->
<!-- doc(_get_doc, hlevel = 3) -->
```
A few MD-macros are also available as multi-line HTML comment
blocks. The `for_in` MD-macro is more often used as a
multi-line HTML comment block:
```
<!-- for x in [_ismethod, _get_doc]:
doc(x, hlevel = 3)
-->
```
If the `pypimdoc` module is used in inline-mode, the equivalent code
of the example above will be:
```python
for x in [_ismethod, _get_doc]:
doc(x, hlevel = 3)
end_for
```
Currently, the MD-macros `for_in` and `code` can be used as
multi-line HTML comment blocks (or inline-mode blocks). This is an
example of such a multi-line HTML comment block for `code`:
```
<!-- code:
def rm_md(name: str) -> str:
return name.split(".")[-1][4:]
def md_mds(title: str) -> str:
return title.replace("Method", "MD-macro")
-->
```
The inline-mode version of the code block example above will be as follows:
```python
code:
def rm_md(name: str) -> str:
return name.split(".")[-1][4:]
def md_mds(title: str) -> str:
return title.replace("Method", "MD-macro")
end_code
```
The MD-macro `code` and the example code blocks above populate the
name space where the MD-macros are executed. As a consequence,
the functions (and variables) defined in such code blocks can be used
in the arguments of MD-macros. For example, based on the code block
above, the documentation of the MD-macro `doc` can be generated using
the MD-macro `doc` itself with the two help functions defined in the
code block above:
```
<!-- doc(PyPiMDoc._md_doc, hlevel = 2, name_transform = rm_md, title_transform = md_mds) -->
```
The `name_transform` argument `rm_md` changes the name of the
method from `"PyPiMDoc._md_doc"` to `"doc"`, and the `title_transform`
argument `md_mds` changes the title `"Method doc"` to
`"MD-macro doc"` (see the implementation of these help functions in
the code block above).
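As a quick check of the two help functions, the following minimal
sketch applies them to the name and title from the example. It only
restates the code block above as plain Python, outside any template:

```python
def rm_md(name: str) -> str:
    # Keep the last dotted part and drop the 4-character "_md_" prefix
    return name.split(".")[-1][4:]

def md_mds(title: str) -> str:
    # Rename the generated title prefix
    return title.replace("Method", "MD-macro")

print(rm_md("PyPiMDoc._md_doc"))  # doc
print(md_mds("Method doc"))       # MD-macro doc
```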
The `pypimdoc` module also provides some predefined help functions
available in the name space where the MD-macros are executed. These
predefined help functions can also be used in the arguments of
MD-macros. For example, to create a level-two header with the title
from the title part (the first line) of the documentation string of
`_ismethod`, you can use the help function `mdoc_title` in the
arguments of the MD-macro `h`:
```
<!-- h(mdoc_title(_ismethod), hlevel = 2) -->
```
The `mdoc_title` function returns the title part (the first line) of the
documentation string of the object provided as the argument; in this
case the function `_ismethod`. To create a level-one header with the
title string of the module (the first line of the documentation string
of the module), you use the MD-macro `h` in the following way:
```
<!-- h(mdoc_title(), hlevel = 1) -->
```
In addition, the two MD-macros `eval` and `cmd` are also made
available in the name space where the MD-macros are executed and can
be used in the argument part of other MD-macros.
To produce the markdown documentation of a module where the
documentation strings are written in markdown, you can use the console
script `pypimdoc`:
```bash
pypimdoc -t README.template -o README.md mymodule.py
```
The command above generates the markdown documentation of the module
`mymodule` in the `README.md` file based on the source code of the module
in the file `mymodule.py` and the README template file
`README.template`. If the `mymodule` provides one class `MyClass`, the
following could be a complete example of the README template file:
```
<!-- doc(hlevel = 1) -->
<!-- doc(MyClass, hlevel = 1, complete = True) -->
```
The first `doc` MD-macro creates a level-one heading with the title
(first line) of the module documentation string followed by the body
of the module documentation string. The second `doc` MD-macro creates
the complete documentation of the class `MyClass` with the constructor
and all public methods (methods with names not starting with `_`). A
level-one heading is added to the start where the title is the title
(first line) of the documentation string of the class `MyClass`. The
documentation for each public method is from the documentation string
of each of these methods, and a sub-heading is added for each of these
methods.
"""
#
# Some useful values
#
# Current version of module
version = "1.12"
# The produced-by text
produced_by = "This documentation is generated using the `pypimdoc` module"
# The where-to-find text
where_to_find = "Available from [PyPi](https://pypi.org/project/pypimdoc/)"
#
# Import Python modules used
#
# Import standard modules
import sys, re
import urllib.parse
import importlib.util
from pathlib import Path
from io import TextIOBase, StringIO
from inspect import (
signature, getdoc, getcomments, isclass, ismethod, isfunction, ismodule)
from collections.abc import Callable
# Use subprocess to perform the command line operations
import subprocess
# Import some `pygments` stuff
#from pygments import highlight
#from pygments.lexers.python import PythonLexer
#from pygments.util import ClassNotFound
#from pygments.styles import get_style_by_name
#from pygments.formatters import HtmlFormatter, LatexFormatter
#
# Regular expressions used by the module
#
# Blocks
_block_begin = r"<!--"
_block_end = r"-->"
_block_end_compiled = re.compile(rf"^\s*{_block_end}\s*$")
# Match `pypimdoc` MD-macros in README templates, like the line
# `doc(PyPiMDoc, hlevel = 1, complete = True)`
_md_macro_re = r"(?P<macro>\w+)\((?P<args>.*)\)"
_md_macro = {
"inline": re.compile(rf"^\s*{_md_macro_re}\s*$"),
"block": re.compile(
rf"^{_block_begin}\s*{_md_macro_re}\s*{_block_end}\s*$")
}
_md_name_arg = re.compile(r"[\w \t='\"]*name=(?P<name>['\"]\w+['\"]).*")
# Loop
_md_loop_re = r"for\s+(?P<var>\w+)\s+in\s+(?P<listexpr>.+)\s*:"
_md_loop_inline_end = r"end_for"
_md_forloop = {
"inline": {
"begin": re.compile(rf"^\s*{_md_loop_re}\s*$"),
"end": re.compile(rf"^\s*{_md_loop_inline_end}\s*$")
},
"block": {
"begin": re.compile(rf"^{_block_begin}\s*{_md_loop_re}\s*$"),
"end": _block_end_compiled
}
}
# Code block
_md_code_re = r"code\s*:"
_md_code_inline_end = r"end_code"
_md_code = {
"inline": {
"begin": re.compile(rf"^\s*{_md_code_re}\s*$"),
"end": re.compile(rf"^\s*{_md_code_inline_end}\s*$")
},
"block": {
"begin": re.compile(rf"^{_block_begin}\s*{_md_code_re}\s*$"),
"end": _block_end_compiled
}
}
# Match <class>.<method>, like `PyPiMDoc.process_template`
_cmnames = re.compile(r"(?P<class>\w+)\.(?P<method>\w+)")
# Match module file name <name>.py, like `pypimdoc.py`
_pysrcname = re.compile(r"^(?P<name>\w+)\.(?P<ext>py)$")
# Match a header (empty line followed by title ending with colon
# followed by empty line)
_margsheader = re.compile(r'\n\s*\n([\w /]+:)\n\s*\n')
# In-line code starts and ends with lines starting with three single
# back-quotes
_inline_code = re.compile(r'^```')
# Match a markdown heading
_md_heading = re.compile(r"^(?P<level>#+)\s*(?P<title>.+)$")
#
# Help functions
#
# Matches for the `help2md` function (sol = start of line)
_sol_lc = re.compile(r"^[a-z].*")
_sol_usage = re.compile(r"^Usage:")
_sol_ws_rest = re.compile(r"^ +.*$")
_sol_empty = re.compile(r"^$")
_sol_descr = re.compile(r"^[a-zA-Z_0-9][a-zA-Z_0-9 ]+.*[^:]$")
_sol_args = re.compile(r"^[OP][a-zA-Z_0-9 ]+:$")
_py_fn = re.compile(r"[a-z]+[.]py")
_single_quoted = re.compile(r"'[^']+'")
_sol_ten_ws = re.compile(r"^ {10}")
_cont_line = re.compile(r"` \| ")
_sol_two_ws = re.compile(r"^ {2}")
def help2md(help_msg: str) -> str:
R"""Convert a help message to markdown text
Convert a command help message (the output from a command when the
`-h` flag is given) to a valid and well-formatted markdown text.
This function is tailored for the help messages produced by Python
programs using the `argparse` module.
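
The kind of input this function is tailored for can be produced as in
the sketch below. The parser, its program name and its arguments are
hypothetical and only serve to generate a typical `argparse` help
message:

```python
import argparse

# Hypothetical parser, used only to show the kind of text expected
parser = argparse.ArgumentParser(
    prog="mymodule", description="Process a file")
parser.add_argument("infile", help="the file to process")
parser.add_argument(
    "-v", "--verbose", action="store_true",
    help="print progress information")

# The same text the program prints when given the `-h` flag
help_msg = parser.format_help()
print(help_msg)
```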
Arguments/return value:
`help_msg`: The help message to convert
`returns`: The markdown text
"""
# Initialize help variables
usage: bool = False
descr: bool = False
options: bool = False
prev: str = ""
nr: int = 0
md_txt: str = ""
# Parse each line of `help_msg`
for line in help_msg.splitlines():
# Count lines
nr += 1
# Use `match` if matching the beginning of line, and `search`
# to match inside line
# Uppercase first character in paragraphs
# /^[a-z]/
if _sol_lc.match(line):
line = line[0].upper() + line[1:]
# Initialize usage section (and optional first usage line)
# /^Usage:/
if _sol_usage.match(line):
usage = True
line = re.sub(r"^Usage: +", "\n```bash\n", line)
line = re.sub(r"^Usage:$", "\n```bash", line)
utxt = "\n**Usage:**\n" + line
continue
# Format usage code
# usage && /^ +.*$/
if usage and _sol_ws_rest.match(line):
line = re.sub(r"^ +", " ", line)
utxt += line
continue
# Close usage code if after usage
# usage && /^$/
if usage and _sol_empty.match(line):
usage = False
descr = True
utxt += "\n```"
continue
# Close options
# options && /^$/
if options and _sol_empty.match(line):
options = False
# Description? (if so, first text after usage)
# descr && /^[a-zA-Z_0-9][a-zA-Z_0-9 ]+.*[^:]$/
if descr and _sol_descr.match(line):
descr = False
prev = "*" + line + "*"
line = utxt
# Initialize options/positional-arguments section
# !usage && /^[OP][a-zA-Z_0-9 ]+:$/
if (not usage) and _sol_args.match(line):
if descr: descr = False
options = True
line = "**" + line + "**\n\nName | Description\n---- | -----------"
# Remove .py from command
# /[a-z]+[.]py/
if _py_fn.search(line):
line = re.sub(r"[.]py", "", line)
# Substitute quote with backquote
# /'[^']+'/
if _single_quoted.search(line):
line = line.replace("'", "`", 2)
# Join continuation lines with previous line
# /^ {10}/
if _sol_ten_ws.match(line):
# options && (prev !~ /` \| /)
if options and not _cont_line.search(prev):
line = re.sub(r"^ *", "` | ", line)
else:
line = re.sub(r"^ *", " ", line)
prev += line
continue
# Format arguments/options table
# !usage && /^ {2}/
if not usage and _sol_two_ws.match(line):
line = re.sub(r"^ +", "`", line)
# Only runs of two or more spaces separate the name from the description
line = re.sub(r" {2,}", "` | ", line)
# Initialize buffered line
# NR == 1
if nr == 1:
prev = line
# Print line (one line buffered)
# NR > 1
else:
md_txt += prev + "\n"
prev = line
# END
md_txt += prev + "\n"
return md_txt
# A few one-liners
_title = lambda obj: obj[0]
_body = lambda obj: obj[1]
_combined = lambda obj: f"{obj[0]}\n\n{obj[1]}"
def _get_nested_attr(ns, attr_str: str) -> object:
R"""Get a nested attribute
Return the named nested attribute `attr_str` from the namespace
`ns`. For example, if `attr_str` is `"a.b"`, return the attribute
`b` of `a`.
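
A minimal, self-contained sketch of the same lookup is shown below.
The namespace `ns` and the function name `get_nested_attr` are
illustrative only, and `functools.reduce` is used here in place of the
explicit loop:

```python
from functools import reduce
from types import SimpleNamespace

# Hypothetical namespace with a nested attribute `a.b`
ns = SimpleNamespace(a=SimpleNamespace(b=42))

def get_nested_attr(ns, attr_str: str) -> object:
    # Apply `getattr` once for each dot-separated name, left to right
    return reduce(getattr, attr_str.split("."), ns)

value = get_nested_attr(ns, "a.b")
print(value)  # 42
```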
Arguments/return value:
`ns`: Name space to find nested attribute in
`attr_str`: Nested attribute as a string using dot notation
`returns`: The attribute named in `attr_str`
"""
attr = ns
nested_attr = attr_str.split(".")
for a in nested_attr:
attr = getattr(attr, a)
return attr
def _mkid(txt: str, idlist: list, max_length: int = 20) -> str:
R"""Make a valid id or reference
Create a valid and unique HTML id/reference from the text string
`txt`. The text string is typically a title or a Python object
name.
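
The following sketch illustrates the idea with a compact
re-implementation (the name `mkid` and the sample title are
illustrative): the text is truncated and quoted, and a counter is
appended when the id is already in use:

```python
import urllib.parse

def mkid(txt: str, idlist: list, max_length: int = 20) -> str:
    # Quote the (truncated) text so it is safe to use as an HTML id
    qid = urllib.parse.quote_plus(txt[:max_length])
    nid, i = qid, 1
    # Append a counter until the id is unused, trimming the quoted
    # text so the result stays within `max_length`
    while nid in idlist:
        num = str(i)
        nid = qid[:max_length - len(num)] + num
        i += 1
    idlist.append(nid)
    return nid

used = []
first = mkid("Insert a heading", used)
second = mkid("Insert a heading", used)
print(first, second)  # Insert+a+heading Insert+a+heading1
```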
Arguments/return value:
`txt`: The text to be transformed to an id
`idlist`: A list of used ids
`max_length`: The maximum length of the id
`returns`: The new unique id
"""
# Create a quoted (safe) id and start with that one as the new id
qid = urllib.parse.quote_plus(txt[:max_length])
nid = qid
lqid = len(qid)
# Continue until we have a unique id
i = 1
while nid in idlist:
# Count and create a number to append (to ensure uniqueness)
num = str(i)
# Ensure that the id is not longer than `max_length`
newl = lqid + len(num)
if newl > max_length:
rl = newl - max_length
nid = qid[:-rl] + num
else:
nid = qid + num
# Increase counter
i += 1
# Add new unique id to id list and return the new id
idlist.append(nid)
return nid
def _ismethod(attr: object) -> bool:
R"""A more relaxed implementation of `ismethod`
This version of `ismethod` will also return `True` if it is not in
an instance of the class of the method. The trick (that might give
false positives) is to check that the function's long name
(`__qualname__`) is a nested name (with a dot).
Arguments/return value:
`attr`: The object we are verifying is a method
`returns`: `True` if `attr` is a method
"""
if ismethod(attr) or isfunction(attr):
cmmatch = _cmnames.match(attr.__qualname__)
if cmmatch:
return True
return False
def _getdoc(attr: object) -> str:
R"""Extended get documentation of attribute
This extended `getdoc` function will first try to return the
documentation string of the attribute, and if that is not
available, the related (possibly multiline) comment.
Arguments/return value:
`attr`: The object to get the doc string from
`returns`: The documentation string
"""
doc = getdoc(attr)
if not doc:
try:
doc = getcomments(attr)
except Exception:
doc = None
if not doc:
if hasattr(attr, "__name__"):
m = f" from {attr.__name__}"
else:
m = ""
raise PyPiMDocError(
f"Unable to get documentation string (or comment){m}")
return doc
def _signature(attr: object) -> str | None:
R"""Get signature of function or method as a text string
Returns the signature of a function or method. If it is a method,
`self` is removed from the signature. If it is not a function or
method, `None` is returned.
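
As an illustration, the sketch below shows the text returned by
`inspect.signature` for a method accessed through its class, and the
effect of removing `self`. The class `Greeter` is hypothetical:

```python
from inspect import signature

class Greeter:
    def greet(self, name: str = "world") -> str:
        return f"Hello, {name}"

# The signature of the plain function still includes `self`
sig = str(signature(Greeter.greet))
# Drop the first `self, ` so the signature reads as the method is called
shown = sig.replace("self, ", "", 1)
print(sig)    # (self, name: str = 'world') -> str
print(shown)  # (name: str = 'world') -> str
```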
Arguments/return value:
`attr`: The object to get the signature of
`returns`: The signature of the function or method as a text
string, or `None` if `attr` is not a function or a method
"""
if isfunction(attr):
sig = str(signature(attr))
if _ismethod(attr):
if "(self, " in sig:
sig = sig.replace("self, ", "", 1)
elif "(self)" in sig:
sig = sig.replace("self", "", 1)
return sig
return None
#
# Exceptions/errors by the module
#
class PyPiMDocError(Exception):
R"""Any error in the `pypimdoc` module"""
def __init__(self, errmsg: str):
# Pass the message on so that `str(error)` is not empty
super().__init__(errmsg)
self.errmsg = errmsg
#
# The main class of the module
#
class PyPiMDoc:
R"""The Python module documentation class
The class implementing the different MD-macros used in the
markdown template for the documentation of a Python module.
The most common usage of the module is as a console script. As a
consequence, the users of the module seldom need to use this class
themselves.
"""
def __init__(
self,
filename: str,
name: str = "",
base_heading_level: int = 1,
toc_begin: int = 1,
toc_end: int = 3):
R"""Initialize a Python module documentation object
Initialize a Python module documentation object, including
loading the Python module (Python source code) and prepare the
document generation.
Arguments:
`filename`: The file name of the module to document
`name`: The name of the module (default generated from the
`filename`)
`base_heading_level`: All generated headings are at this level
or above (default 1)
`toc_begin`: Include items in table of contents from this
level (relative to `base_heading_level`, default 1)
`toc_end`: Include items in table of contents to this level
(relative to `base_heading_level`, default 3)
"""
# Initiate object values from the constructor arguments (or defaults)
self.filename = filename
self.name = name
self.base_heading_level = base_heading_level
self.toc_begin = base_heading_level + toc_begin - 1
self.toc_end = base_heading_level + toc_end - 1
# The documentation can contain a set of table of contents
self.mktoc = set()
# Save toc items here (for each toc set)
self.tocpart = {}
self.doc_tocpart = {}
# A list of used ids (to ensure that we generate unique ids)
self.idlist = []
# How headings at different levels are created (pre, post), 0
# means no heading
self.hmarkers = [
("", ""), # 0
("# ", ""), # 1
("## ", ""), # 2
("### ", ""), # 3
("#### ", ""), # 4
("**", "**"), # 5
("*", "*")] # > 5
# Name of the module to document (either given or from the file name)
if not self.name:
mpn = _pysrcname.match(self.filename)
if mpn:
self.name = mpn["name"]
else:
raise PyPiMDocError(
f"Unable to determine module: {self.filename}")
# Load the module to document
spec = importlib.util.spec_from_file_location(self.name, self.filename)
self.module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(self.module)
# Make namespaces used by the MD-macros
self.mgns = vars(self.module)
self.mlns = {
"mdoc_doc": self._mdoc_doc,
"mdoc_title": self._mdoc_title,
"mdoc_body": self._mdoc_body,
"eval": self._md_eval,
"cmd": self._md_cmd
}
# Are the names above safe to use?
# List all MD-macros (from this class starting with "_md_")
self.md_macros = _list_md_macros(
self, rm_pre = True, qualname = False, sort_order = None)
def _get_real_hlevel(self, hlevel: int) -> int:
R"""Calibrate levels with the base heading level
Calibrate all references to heading levels by adding the base
heading level to the given heading level (except when the
given heading level is zero).
Arguments/return value:
`hlevel`: The given heading level
`returns`: The calibrated (adjusted) heading level
"""
if hlevel > 0:
hlevel += self.base_heading_level - 1
return hlevel
def _get_title_and_doc(self, obj: object | None = None) -> tuple:
R"""Split title and body of documentation string
The first line could be the title of the documentation string if
it is a single line of text followed by an empty line.
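
The rule can be sketched as a small stand-alone function (the name
`split_title` and the sample strings are illustrative only):

```python
def split_title(doc: str) -> tuple:
    lines = doc.strip().splitlines()
    # A title is a non-empty first line followed by an empty line
    if len(lines) > 1 and lines[0].strip() and lines[1].strip() == "":
        return lines[0].strip(), "\n".join(lines[1:]).strip()
    # No separate first line: no title, documentation string unchanged
    return None, doc

title, body = split_title("Get a nested attribute\n\nReturn the attribute.")
no_title, unchanged = split_title("Only one paragraph\nwith two lines.")
print(title)     # Get a nested attribute
print(no_title)  # None
```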
Arguments/return value:
`obj`: An object to get the documentation string from, or `None`
meaning the documentation string of the module
`returns`: If the first line of the documentation string of the
object is separate, return the first line and the rest of the
documentation string as a two-tuple, otherwise return `None` and
the unchanged documentation string as a two-tuple
"""
# If no object, it is the module
if not obj:
obj = self.module
# Get documentation string from object
try:
doc = _getdoc(obj)
except Exception:
doc = None
title = None
# If documentation, try to split it into a title and a body
if doc:
doclist = doc.strip().splitlines()
if (len(doclist) > 1 and (title:=doclist[0].strip())
and (doclist[1].strip() == "")):
doc = "\n".join(doclist[1:]).strip()
else:
title = None
# Return title and body of documentation string
return title, doc
def _raw_mdoc(self, obj: object | None, sel: Callable) -> str:
return sel(self._get_title_and_doc(obj))
def _mdoc_doc(self, obj: object | None = None) -> str:
R"""The documentation string of an object
The function returns the complete documentation string of the
object, including the title (the first line), the following
empty line and the body. The function takes one optional
argument, the object to get the documentation string from. If
no argument is given, the documentation string of the module
is used.
Arguments/return value:
`obj`: The object to get the documentation string from
(default `None`, meaning get the documentation string of the
module)
`returns`: The documentation string of the object
"""
return self._raw_mdoc(obj, _combined)
def _mdoc_title(self, obj: object | None = None) -> str:
R"""The title of the documentation string of an object
A function returning the first line of the documentation string,
often considered the title of the documentation string. This
only succeeds if the first line is followed by an empty
line. The function takes one optional argument, the object to
get the documentation string from. If no argument is given,
the documentation string of the module is used.
Arguments/return value:
`obj`: The object to get the documentation string title from
(default `None`, meaning get the documentation string title of
the module)
`returns`: The documentation string title of the object
"""
return self._raw_mdoc(obj, _title)
def _mdoc_body(self, obj: object | None = None) -> str:
R"""The body of the documentation string of an object
The body of the documentation string, meaning the
documentation string except the title (the first line) and the
empty line between the title and the body. The
function takes one optional argument, the object to get the
documentation string from. If no argument is given, the
documentation string of the module is used.
Arguments/return value:
`obj`: The object to get the documentation string body from
(default `None`, meaning get the documentation string body of
the module)
`returns`: The documentation string body of the object
"""
return self._raw_mdoc(obj, _body)
def _mk_h(
self,
title: str,
hlevel: int,
hid: str = "",
in_doc: bool = False,
no_toc: bool = False) -> str:
R"""Create a heading and add it to the table of contents
Create a heading at the given level `hlevel` with the given
title. If a table of contents is generated, add an id to the
title and add an entry to the table of contents.
Arguments/return value:
`title`: The title (text) of the heading (section)
`hlevel`: The heading level, where 1 is the highest level
`hid`: Optional heading id that might be modified to ensure
uniqueness (if not given, it will be generated if needed)
`in_doc`: Handle a heading in documentation string (default
`False`)
`no_toc`: If `True` do not add to table of contents
(default `False`)
`returns`: A heading typeset to the correct heading level
"""
# If we should add table of contents
if not no_toc:
# Create `hid` if not given
if hid:
hid = _mkid(hid, self.idlist)
else:
hid = _mkid(title, self.idlist)
# If it is inside a documentation string, handle it differently
if in_doc:
tocpart = self.doc_tocpart
else:
tocpart = self.tocpart
# Create header markers
# Levels beyond the last entry use the last entry's markers
hlevelpre, hlevelpost = self.hmarkers[min(hlevel, len(self.hmarkers) - 1)]
# Add to table of content
if not no_toc:
if (self.mktoc and hlevel >= self.toc_begin
and hlevel <= self.toc_end):
for t in self.mktoc:
ilevel = hlevel - self.toc_begin
tocpart[t]["items"].append({
"ilevel": ilevel,
"content": f"[{title}](#{hid})"})
# Create and return header
if hid:
idtxt = f'<a id="{hid}"></a>'
else:
idtxt = ""
return f'\n{hlevelpre}{idtxt}{title}{hlevelpost}\n'
def _flush_h(self):
R"""Add the toc part of documentation string to toc
This method appends the temporarily saved table of contents
items from a documentation string to the given (global) table
of contents items and then empties the list of temporarily
saved table of contents items.
"""
for t in self.mktoc:
if self.doc_tocpart[t]["items"]:
self.tocpart[t]["items"] += self.doc_tocpart[t]["items"]
self.doc_tocpart[t]["items"] = []
def process_template(
self, template: TextIOBase,
macro_types: str = "block") -> str:
R"""Read and process template
The template file is the driver of the document generation. In
its simplest form, the template is a markdown document that
becomes the module documentation. The template includes some
special commands (MD-macros) used to fetch documentation from
the module (including the documentation strings in the
module).
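
As a rough illustration of how a block-mode MD-macro line is
recognized, the sketch below applies the same regular expression
patterns that the module defines for this purpose to a single template
line. The variable names are local to the example:

```python
import re

# Same patterns as the module uses for block-mode MD-macro lines
md_macro_re = r"(?P<macro>\w+)\((?P<args>.*)\)"
block_macro = re.compile(rf"^<!--\s*{md_macro_re}\s*-->\s*$")

line = "<!-- doc(PyPiMDoc.process_template, hlevel=2) -->\n"
m = block_macro.match(line)
macro, args = m["macro"], m["args"]
print(macro)  # doc
print(args)   # PyPiMDoc.process_template, hlevel=2

# An ordinary markdown line does not match and is copied unchanged
plain = block_macro.match("# A heading\n")
```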
Arguments/return value:
`template`: The README template file
`macro_types`: Either `"block"` or `"inline"`, where
`"block"` means that the MD-macros are inside HTML comment
blocks and `"inline"` means that MD-macros are directly
inline in the markdown documentation strings (default
`"block"`)
`returns`: The processed markdown README file content
"""
# Start the README file with produced-by/where-to-find comments
mdoc = f"<!-- {produced_by} -->\n"
mdoc += f"<!-- {where_to_find} -->\n"
# We are not in a code block
in_code_block = False
code_block = ""
# We are not in a for-loop in the beginning
in_loop = False
loop_body = ""
# We are not in inline code mode in the beginning
inline = False
# Go through each line
for line in template:
# A code block?
if _md_code[macro_types]["begin"].match(line):
in_code_block = True
continue
# End code block
elif in_code_block and _md_code[macro_types]["end"].match(line):
self._md_code(code_block)
in_code_block = False
code_block = ""
continue
# Inside code block
elif in_code_block:
code_block += line
continue
# A for-loop?
elif (loop_info := _md_forloop[macro_types]["begin"].match(line)):
in_loop = True
in_loop_var_str = loop_info["var"]
in_loop_list_str = loop_info["listexpr"]
continue
# End of for-loop
elif in_loop and _md_forloop[macro_types]["end"].match(line):
mdoc += self._md_for_in(
in_loop_var_str, in_loop_list_str, loop_body)
in_loop = False
loop_body = ""
continue
# Inside for-loop
elif in_loop:
loop_body += line
continue
# Is this inline code in the markdown text?
if _inline_code.match(line):
if inline:
inline = False
else:
inline = True
mdoc += line
continue
elif inline:
mdoc += line
continue
# Is this a command
mcmd = _md_macro[macro_types].match(line)
# Yes, a command
if mcmd:
# Process the found MD-macro
res = self.process_macro(mcmd["macro"], mcmd["args"])
# If it produces text, add it to the documentation
if res:
mdoc += res
# No
else:
# Just save the documentation line
mdoc += line
# Add toc
if self.tocpart:
# Go through every toc (we can have more than one)
for t in self.tocpart:
# Might need this to adjust indent
min_i = min([i["ilevel"] for i in self.tocpart[t]["items"]])
# For each text item in the current toc
toc = []
# Each toc text item starts with this
start = self.tocpart[t]["toc_item_start"]
# Each item in the toc
for item in self.tocpart[t]["items"]:
# Calculate the indent size and make the indentation
indentsize = \
self.tocpart[t]["toc_item_indent"] * \
(item["ilevel"] - min_i)
indent = " " * indentsize
# Add the text item
toc.append(f'{indent}{start}{item["content"]}')
# Insert the toc in the documentation string
mdoc = mdoc.replace(
f"%({t})s",
self.tocpart[t]["toc_item_end"].join(toc))
# Return doc
return mdoc
def process_macro(self, macro_name: str, args_str: str) -> str:
R"""Process an MD-macro
Process an MD-macro with the given name and arguments.
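
The argument string is turned into positional and keyword arguments
by evaluating it as a call to a small helper. A minimal sketch of that
technique, with the illustrative names `split_args`, `collect` and
`level`, could look like this:

```python
def split_args(args_str: str, namespace: dict) -> tuple:
    # A helper that just hands back whatever it is called with
    collect = lambda *args, **kw: (args, kw)
    # Evaluate the argument text as a call to the helper
    return eval(
        f"collect({args_str})", dict(namespace), {"collect": collect})

# `level` stands in for a name defined in the documented module
args, kw = split_args('"Title", hlevel = level + 1', {"level": 1})
print(args)  # ('Title',)
print(kw)    # {'hlevel': 2}
```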
Arguments/return value:
`macro_name`: MD-macro name
`args_str`: the arguments to the MD-macro as a string
`returns`: returns the documentation part generated by the MD-macro
"""
# Get MD-macro
full_name = "_md_" + macro_name
if macro_name in self.md_macros and hasattr(self, full_name):
macro = getattr(self, full_name)
else:
raise PyPiMDocError(f"Unknown MD-macro: {macro_name}")
# Get the arguments
_args_kw = lambda *args, **kw: (args, kw)
# Pass the namespaces positionally; `eval` only accepts them as
# keyword arguments from Python 3.13
args, kw = eval(
f"_args_kw({args_str})",
self.mgns, self.mlns | {"_args_kw": _args_kw})
# Perform the macro
return macro(*args, **kw)
def _md_h(self, title: str, hlevel: int,
hid: str = "", no_toc: bool = False) -> str:
R"""Insert a heading
Insert a heading at the given level (including adjustment
from base level).
Arguments/return value:
`title`: A title
`hlevel`: The heading level for the title
`hid`: An id for the title that is used to be able to link to
it (default empty, meaning it will be generated from the title)
`no_toc`: Set this to `True` if the heading should not be
included in the table of contents (default `False`)
`returns`: The formatted heading
"""
# Level is relative
hlevel = self._get_real_hlevel(hlevel)
# Create and return the header
return self._mk_h(title, hlevel, hid, no_toc = no_toc)
def _md_doc(
self,
obj: object | str | list | None = None,
name: str = "",
title: str = "",
hid: str = "",
hlevel: int = 0,
sig: str = "",
init: bool = False,
complete: bool | list = False,
init_title: str = "Initialize",
skip_firstline: bool = False,
doc_headings: bool = True,
name_transform: Callable = lambda n: n,
title_transform: Callable = lambda n: n) -> str:
R"""Insert the documentation of the given object
Returns the documentation for the given object (class, method,
function). If no object is given, the documentation of the
module is returned.
Arguments/return value:
`obj`: The object (function, method, class) to prepare and
return the documentation for. If `obj` is a list, the
documentation for all objects in the list are prepared and
returned (in separate paragraphs). If no object is given, the
documentation for the module is prepared and returned
(optional).
`name`: The name of the object (optional; we can find it)
`title`: A title for the documentation if the heading is
generated (optional; we will generate a proper title if
`hlevel` is higher than zero and no title is given)
`hid`: An id for the title that is used to be able to link to
it (optional; will be generated if needed and not given)
`hlevel`: The heading level, cf. HTML h tag level (default 0,
meaning no heading generated)
`sig`: A signature can be provided for methods/functions, but
this is usually not needed since the MD-macro is able to
generate this from the method/function (default `""`)
`init`: Include the documentation and signature of the
`__init__` method in the documentation of the object if the
object is a class and has an `__init__` method (default
`False`)
`complete`: If the object is a class, include the
documentation for the class, its constructor (the `__init__`
method) and all non-hidden methods, when complete is `True`,
or the listed methods, when complete is a list of methods
(default `False`)
`init_title`: If `complete` is set (`True` or a list) and the
object is a class, use this combined with the class name as
the title for the constructor (the `__init__` method)
`skip_firstline`: The first line of the documentation string
might have a specific meaning, like a title or a sub-title,
and sometimes we might want to skip this part in the generated
documentation (default `False`)
`doc_headings`: if `True`, detect and handle headings in the
documentation string, otherwise do nothing (default `True`)
`name_transform`: a function that takes a text string as an
argument and returns a text string; the function can be used
to transform the (found) name
`title_transform`: a function that takes a text string as an
argument and returns a text string; the function can be used
to transform the (found) title
`returns`: The documentation of the given object (or the module)
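
A transform is any function from a text string to a text string. As
a minimal sketch (the helper `macro_name` is hypothetical and not
part of this module, and the `_md_` prefix is an assumption), the
following transform strips the class name and the prefix from a
qualified name:

```python
def macro_name(name: str) -> str:
    # Keep the last component of the qualified name
    short = name.split(".")[-1]
    # Drop the assumed `_md_` prefix, if present (Python 3.9+)
    return short.removeprefix("_md_")

print(macro_name("PyPiMDoc._md_toc"))  # prints: toc
```
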
"""
# The documentation of the module attribute
adoc = ""
# The special case, if `obj` is a list
if type(obj) is list:
# For each object, get the documentation
for an_obj in obj:
adoc += f"\n{self._md_doc(an_obj, hlevel=hlevel).strip()}\n"
# Return the combined documentation
return adoc
# Level is relative
org_hlevel = hlevel
hlevel = self._get_real_hlevel(hlevel)
# Get the object (attribute)
if obj:
if type(obj) is str:
attr = _get_nested_attr(self.module, obj)
else:
attr = obj
if not name:
name = name_transform(attr.__qualname__)
else:
attr = self.module
if not name:
name = name_transform(self.name)
# Get documentation string
# First line often has a special meaning (title or sub-title)
firstline, raw_doc = self._get_title_and_doc(attr)
# Should we detect and handle headings in documentation strings?
if raw_doc and doc_headings:
# Initialize variables
doc = "" # The handled documentation string
inline_code = False # Inside an inline code block
# Go through each line of the documentation string
for line in raw_doc.splitlines():
# Is the line the start or end of an inline code
# block? And if so, toggle the `inline_code` flag
if _inline_code.match(line):
inline_code = not inline_code
# If inside an inline code block
if inline_code:
# Inline code block content is unmodified
doc += f"{line}\n"
# If not inside an inline code block
else:
# Is this a heading
heading = _md_heading.match(line)
if heading:
# Create a heading
level = self._get_real_hlevel(len(heading["level"]))
doc += self._mk_h(
heading["title"], level, in_doc = True)
# Otherwise, just add the unmodified line
else:
doc += f"{line}\n"
# Do nothing with the raw documentation string
else:
doc = raw_doc
# If `hlevel` < 1 and a title, we don't need `hid` (and levelmarkers)
# If `hlevel` >= 1, a title (and maybe `hid`) has to be added
if hlevel > 0 and not title:
# Create `title` (and `hid`) from attribute
for (ttype, ttest) in [
("Class", isclass),
("Method", _ismethod),
("Function", isfunction),
("Module", ismodule)]:
if ttest(attr):
title = f"{ttype} `{name}`"
if hid:
hid = _mkid(hid, self.idlist)
else:
hid = _mkid(f"{name.lower()}", self.idlist)
break
# Not able to create `title`, use first line as `title`
else:
# Get first line from `doc` and make it `title` + make `hid`
if firstline:
title = firstline
firstline = None
if hid:
hid = _mkid(hid, self.idlist)
else:
hid = _mkid(title, self.idlist)
else:
raise PyPiMDocError("Unable to find title for doc string")
# Get signature of method
if not sig and isfunction(attr):
sig = _signature(attr)
# Signature of class from `__init__` (and its documentation string)
elif isclass(attr) and hasattr(attr, "__init__") and init:
fline, init_doc = self._get_title_and_doc(attr.__init__)
if fline:
init_doc = f"**{fline}**\n\n{init_doc}"
if init_doc:
doc += f"\n\n{init_doc}"
if not sig:
sig = _signature(attr.__init__)
# Add the title to the module doc
if title:
adoc += self._mk_h(title_transform(title), hlevel, hid)
# Flush documentation string headings to table of content list
self._flush_h()
# Add signature
if sig:
if "." in name:
fname = name.split(".")[-1]
else:
fname = name
adoc += f"\n```python\n{fname}{sig}\n```\n"
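# Arguments/Returns headers in the documentation string
# (the third positional argument of a compiled pattern's `sub` is
# `count`, not `flags`; flags belong in the pattern's `re.compile`)
doc = _margsheader.sub(r'\n\n{{\1}}\n\n', doc)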
doc = doc.replace("{{", "**").replace("}}", "**")
# Complete class, including methods
if complete and isclass(attr):
# Include the constructor (`__init__`) if implemented
if (not init and hasattr(attr, "__init__")
and _ismethod(attr.__init__)):
method_kw_list = [
{"obj": attr.__init__,
"name": name,
"title": f"{init_title} `{name}`"}]
else:
method_kw_list = []
# The methods are listed
if type(complete) is list:
method_kw_list += [{"obj": m} for m in complete]
# If complete is not a list, find all public methods
else:
for n in dir(attr):
m = getattr(attr, n)
if _ismethod(m) and m.__name__[0] != "_":
method_kw_list.append({"obj": m})
# Add the documentation for the methods of the class
for kw in method_kw_list:
kw["hlevel"] = org_hlevel + 1 if org_hlevel > 0 else 0
kw["name_transform"] = name_transform
doc += "\n\n" + self._md_doc(**kw)
# Add the documentation string to the module doc
if firstline and not skip_firstline:
doc = f"*{firstline}*\n\n{doc}"
adoc += f"\n{doc}\n"
# Return the documentation of the object (or module)
return adoc
def _md_toc(self, name: str = "toc", btoc: bool = True,
toc_item_start: str = " - ", toc_item_end: str = "\n",
toc_item_indent: int = 4) -> str:
R"""Insert a table of contents
Insert a table of contents with all headings following this
MD-macro until the end of document or until a matching `etoc`
MD-macro. If the `btoc` argument is `False`, the table of
contents will be inserted here but items (headings) for the
table of contents will not be registered yet. You then need to
insert a `btoc` MD-macro in the README template to start
collecting items for the table of contents.
It is also possible to have different sets of table of
contents. To do this, give each set a unique name (the
default name is `"toc"`).
Arguments/return value:
`name`: The name of this specific table of contents; only
needed if you have different sets or groups of table of
contents in the README template (optional, default `"toc"`)
`btoc`: If `False`, do not start to collect items for the
table of contents here (default `True`)
`toc_item_start`: The text string preceding every item in the
table of contents (default `" - "`)
`toc_item_end`: The text string following every item in the
table of contents (default `"\n"`)
`toc_item_indent`: The indentation (number of spaces) per heading
level for items in the table of contents (default 4)
`returns`: The formatted version of the table of contents
"""
# Start collecting items to table of contents (with the given name)
if btoc:
self._md_btoc(name)
# The datastructure for this table of contents
self.tocpart[name] = {
"items" : [],
"toc_item_start": toc_item_start,
"toc_item_end": toc_item_end,
"toc_item_indent": toc_item_indent
}
# For items inside documentation strings
self.doc_tocpart[name] = {
"items" : [],
"toc_item_start": toc_item_start,
"toc_item_end": toc_item_end,
"toc_item_indent": toc_item_indent
}
# Return a placeholder for the table of contents
return f"%({name})s"
def _md_btoc(self, name: str = "toc"):
R"""Start to collect items to table of contents
Start to collect items to table of contents (with the given
name). From now on and until the matching `etoc` MD-macro or
the end of the file, every heading will be added as an item to
the table of contents (with the exceptions of headings marked
not to be added to table of contents).
Arguments:
`name`: The name of this specific table of contents; only
needed if you have different sets or groups of table of
contents in the README template (optional, default `"toc"`)
"""
self.mktoc.add(name)
def _md_etoc(self, name: str = "toc"):
R"""Stop collecting items to table of contents
Stop collecting items to table of contents (with the given
name).
Arguments:
`name`: The name of this specific table of contents; only
needed if you have different sets or groups of table of
contents in the README template (optional, default `"toc"`)
"""
self.mktoc.discard(name)
def _md_for_in(self, loop_var: str, loop_list: str, loop_body: str) -> str:
R"""Loop through a list of documentation elements
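Process `loop_body` as an inline template once for each element of
`loop_list`, with the element bound to `loop_var` in the MD-macros’
local name space. The local name space is restored after the loop.
Arguments/return value:
`loop_var`: The name of the loop variable
`loop_list`: A Python expression that evaluates to the elements to
loop through
`loop_body`: The template text processed for each element
`returns`: The combined text of all iterations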
"""
mdoc = ""
save_mlns = self.mlns.copy()
for x in eval(loop_list, self.mgns, self.mlns):
self.mlns[loop_var] = x
mdoc += self.process_template(StringIO(loop_body), "inline")
self.mlns = save_mlns
return mdoc
def _md_eval(self, code: str) -> str:
R"""Insert the text output of the Python code
Insert the text output of the Python code evaluated in the
name space of the module and the MD-macros’ local name space.
Arguments/return value:
`code`: The Python code to evaluate
`returns`: The resulting text
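
A minimal sketch of how the two name spaces interact, assuming plain
dictionaries as stand-ins for the module name space (`mgns`) and the
MD-macros’ local name space (`mlns`); the content of the dictionaries
is hypothetical:

```python
# Stand-ins for the two name spaces (hypothetical content)
mgns = {"answer": 42}  # plays the role of the module name space
mlns = {}              # plays the role of the local name space

# Names assigned by executed code end up in the local name space
exec("greeting = 'hello'", mgns, mlns)

# Evaluated code can read names from both name spaces
result = eval("greeting.upper() + ' ' + str(answer)", mgns, mlns)
print(result)  # prints: HELLO 42
```
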
"""
return eval(code, self.mgns, self.mlns)
def _md_code(self, code: str):
R"""Execute the code
Execute the code to populate the MD-macros’ local name space
that later can be used in MD-macros arguments and in the code
of the MD-macro `eval`.
Arguments:
`code`: The Python code to execute
"""
exec(code, self.mgns, self.mlns)
def _md_cmd(self, cmd: str) -> str:
R"""Insert the text output of the command
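Insert the text output of the command. The command is split on
whitespace and executed directly (not through a shell), so quoted
arguments containing spaces, pipes, and redirections are not
supported. An error is raised if the command fails.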
Arguments/return value:
`cmd`: The shell command
`returns`: The output of the command
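
A small illustration of plain whitespace splitting compared with
shell-like splitting. The command line is hypothetical, and
`shlex.split` is a standard library alternative shown only for
comparison; it is not what this MD-macro uses:

```python
import shlex

cmd = 'grep "two words" notes.txt'  # hypothetical command line

# Whitespace splitting breaks the quoted argument into two parts
plain = cmd.split()

# Shell-like splitting keeps the quoted argument together
shell_like = shlex.split(cmd)

print(plain)
print(shell_like)
```
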
"""
cmdl = cmd.split()
res = subprocess.run(cmdl, text=True, capture_output=True)
if res.returncode != 0:
raise PyPiMDocError(f"Command failed: {cmd}")
else:
return res.stdout.strip()
def _md_cmd_cb(self, cmd: str) -> str:
R"""Insert the text output of the command as a code block
Insert the text output of the (shell) command as a code block.
Arguments/return value:
`cmd`: The shell command
`returns`: The output of the command in a code block
"""
return f"```\n{self._md_cmd(cmd)}\n```\n"
def _md_help(self, cmd: str = "", sub_cmd: str = "",
title: str = "", hlevel: int = 0, hid: str = '',
no_toc: bool = False) -> str:
R"""Insert the output from a help command
Insert the output from a help command reformatted as markdown.
The output of the help command is expected to be formatted as
the Python module `argparse` formats the help text.
Arguments/return value:
`cmd`: The help command (default empty, meaning execute the
current module's file with the command line argument
`"-h"`)
`sub_cmd`: The sub-command (default empty, meaning the help
message of the main command)
`title`: The title used in the heading (create a default title
if this is not provided)
`hlevel`: The heading level for the title (default 0, meaning
no heading)
`hid`: An id for the title that is used to be able to link to
it (default empty, meaning it will be generated from the
title)
`no_toc`: Set this to `True` if the heading should not be
included in the table of contents (default `False`)
`returns`: The heading and the output of the help command,
formatted as markdown
"""
# A sub command?
if sub_cmd:
sub = f" {sub_cmd}"
else:
sub = ""
# Make the heading (with title), if needed
if hlevel > 0:
if not title:
title = f"Command `{self.name}{sub}`"
heading = self._md_h(title, hlevel, hid, no_toc) + "\n\n"
else:
heading = ""
# Get the help text
if not cmd:
cmd = f"{sys.executable} {self.filename}{sub} -h"
# Get help text and convert it to markdown
help_txt = self._md_cmd(cmd)
md_txt = help2md(help_txt)
# Return heading and help text
return heading + md_txt
#
# More help functions
#
def _list_md_macros(
cls: object = PyPiMDoc,
pre: str = "_md_",
rm_pre: bool = False,
qualname: bool = True,
sort_order: list | None = [
"^h$", "([be])?doc", "([be])?toc", "eval", "([lbe])?code",
"[l]?cmd", "help"
]) -> list:
R"""List all the MD-macros
List all the MD-macros of the given object or class.
Arguments/return value:
`cls`: Class or object to list the MD-macros from (default `PyPiMDoc`)
`pre`: The pre-string of all MD-macros (default `"_md_"`)
`rm_pre`: Remove the pre-string from the name of each macro in the
list (default `False`)
`qualname`: Use the fully qualified name of the macros (include
the class name, default `True`)
`sort_order`: A list of regular expressions specifying the sort
order in the returned list of macro names; if this is `None`, no
extra sorting is done (for a regular expression with a group,
names with an empty group are put in front of the other names
matching the same regular expression; see the default value in
the function definition)
`returns`: The list of MD-macro names
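
The grouping can be sketched as follows. This is a simplified,
stand-alone illustration: the helper `sort_names` is hypothetical and
not part of this module, and it works on bare names without the
pre-string:

```python
import re

def sort_names(names, sort_order):
    # One bucket per regular expression, plus a rest list
    groups = {so: [] for so in sort_order}
    rest = []
    for n in names:
        for so in sort_order:
            if me := re.match(so, n):
                # A name with an empty group goes first in its bucket
                if me.groups() and not me.group(1):
                    groups[so].insert(0, n)
                else:
                    groups[so].append(n)
                break
        else:
            # No regular expression matched this name
            rest.append(n)
    # Buckets in sort order, followed by the unmatched names
    return [n for so in sort_order for n in groups[so]] + rest

names = ["bdoc", "cmd", "doc", "edoc", "h", "help", "toc"]
ordered = sort_names(names, ["^h$", "([be])?doc", "([be])?toc"])
print(ordered)
```

Here `doc` is placed before `bdoc` and `edoc` because its group is
empty, and `cmd` and `help` end up last since no expression matches
them.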
"""
# Find the MD-macros (with the names starting with `pre`)
psize = len(pre)
mdm = [m for m in dir(cls) if m[:psize] == pre]
# Use the fully qualified name?
if qualname:
mdm = [getattr(cls, m).__qualname__ for m in mdm]
# Or remove the first part of the name, the `pre` string
elif rm_pre:
mdm = [m[psize:] for m in mdm]
# Should the macro names be sorted in a specific order?
if sort_order:
# Do the sorting by groups (the sort order groups)
mdm_sort = {so: [] for so in sort_order}
rest = []
for m in mdm:
for so in sort_order:
ma = m.split(".")[-1]
if ma[:psize] == pre:
ma = ma[psize:]
if me:=re.match(so, ma):
if me.groups() and not me.group(1):
mdm_sort[so] = [m] + mdm_sort[so]
else:
mdm_sort[so].append(m)
break
else:
rest.append(m)
# Merge the groups to a single sorted list
mdm = []
for so in sort_order:
mdm += mdm_sort[so]
mdm += rest
# Return a list of the (sorted) macro names
return mdm
#
# The rest of the code is to run the module as an interactive command
#
# Execute this module as a program
def main():
# Formatters
formatters = ["markdown", "html", "latex"]
# Create overall argument parser
import argparse
parser = argparse.ArgumentParser(
description=__doc__.splitlines()[0])
parser.add_argument(
"pysrc", metavar="PYSRC",
help="module source code file")
parser.add_argument(
"-V", "--version", action="version",
version=f"%(prog)s " + version)
parser.add_argument(
"-t", "--template",
type=argparse.FileType("r"),
help="markdown template (default 'README.template')")
parser.add_argument(
"-o", "--outfile", default=sys.stdout, type=argparse.FileType("w"),
help="output file (default stdout)")
#parser.add_argument(
# "-f", "--formatter", default=None, choices=formatters,
# help="formatter to use (default guessed by filename or 'markdown')")
#parser.add_argument(
# "-s", "--style", default="emacs",
# help="style (default 'emacs')")
parser.add_argument(
"-l", "--base-heading-level", default=1, type=int,
help="base (start) level of headings " + \
"(default 1, like '<h1></h1>' in HTML)")
parser.add_argument(
"-i", "--inline-md-macros", action="store_true",
help="MD-macros are inline in the markdown template " + \
"(and not inside HTML-comments)")
parser.add_argument(
"-n", "--name", default=None,
help="name of module (default source code filename without '.py')")
# Parse arguments
args = parser.parse_args()
# Choose formatter (html or latex)
# if args.formatter:
# if args.formatter == "html":
# formatter = HtmlFormatter()
# elif args.formatter == "latex":
# formatter = LatexFormatter()
# else:
# formatter = None
# else:
# if Path(args.outfile.name).suffix in [".html", ".htm"]:
# formatter = HtmlFormatter()
# elif Path(args.outfile.name).suffix in [".ltx", ".tex", ".latex"]:
# formatter = LatexFormatter()
# else:
# formatter = None
# Choose style
#try:
# style = get_style_by_name(args.style)
#except ClassNotFound:
# print(f"{sys.argv[0]}: unknown style {args.style}", file=sys.stderr)
# sys.exit(1)
# MD-macros inline or in HTML comment blocks
if args.inline_md_macros:
macro_types = "inline"
else:
macro_types = "block"
# Create `PyPiMDoc` instance and create the documentation
pypimdoc = PyPiMDoc(args.pysrc, base_heading_level=args.base_heading_level)
md = pypimdoc.process_template(args.template, macro_types)
print(md, file=args.outfile)
# execute this module as a program
if __name__ == '__main__':
main()