py-gfm documentation

This is an implementation of GitHub-Flavored Markdown written as an extension to the Python Markdown library. It aims for maximal compatibility with GitHub’s rendering.

py-gfm code is under a BSD-style license.

Installation

pip install py-gfm

Quick start

All-in-one extension

import markdown
from mdx_gfm import GithubFlavoredMarkdownExtension

source = """
Hello, *world*! This is a ~~good~~marvelous day!
Here is an auto link: https://example.org/

Let me introduce you to [task lists](https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments):

- [ ] eggs
- [x] milk

You can also have fenced code blocks:

```python
import this
```
"""

# Direct conversion:
html = markdown.markdown(
    source, extensions=[GithubFlavoredMarkdownExtension()])

# Factory-like:
md = markdown.Markdown(extensions=[GithubFlavoredMarkdownExtension()])
html = md.convert(source)

# By module name (not recommended if you need custom configs):
html = markdown.markdown(source, extensions=['mdx_gfm'])

À la carte

import markdown
from gfm import AutolinkExtension, TaskListExtension

html = markdown.markdown(
    source, extensions=[AutolinkExtension(),
                        TaskListExtension(max_depth=2)])

Available extensions

gfm.semi_sane_lists – GitHub-like list parsing

The gfm.semi_sane_lists module provides an extension that causes lists to be treated the same way GitHub does.

Like the sane_lists extension, GitHub considers a list to end if it’s separated by multiple newlines from another list of a different type. Unlike the sane_lists extension, GitHub will mix list types if they’re not separated by multiple newlines.

GitHub also recognizes lists that start in the middle of a paragraph. This is currently not supported by this extension, since the Python parser has a deeply-ingrained belief that blocks are always separated by multiple newlines.
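The list-separation rule described above can be modeled in a few lines of plain Python. This is only a toy sketch of the rule, not the extension's implementation: the `group_lists` function and the `ITEM` pattern are hypothetical names introduced for illustration, and the model ignores nesting and non-list lines.

```python
import re

# Hypothetical pattern: a bullet marker or an "N." marker, then the item text.
ITEM = re.compile(r"^(?:([-*+])|(\d+\.))\s+(.*)$")


def group_lists(text):
    """Group list items into lists, following the rule described above."""
    groups = []  # each entry is (kind, [item, ...])
    blank_seen = False
    for line in text.splitlines():
        if not line.strip():
            blank_seen = True
            continue
        match = ITEM.match(line)
        if not match:
            continue  # toy model: ignore anything that is not a list item
        kind = "ul" if match.group(1) else "ol"
        # A new list starts only when a blank line and a type change coincide.
        if not groups or (blank_seen and groups[-1][0] != kind):
            groups.append((kind, []))
        groups[-1][1].append(match.group(3))
        blank_seen = False
    return groups


print(group_lists("- eggs\n- milk\n\n1. mix\n2. stew"))
print(group_lists("- eggs\n1. mix"))
```

In this model, the first input yields two lists because a blank line separates items of different types, while the second yields a single mixed list.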

Typical usage

import markdown
from gfm import SemiSaneListExtension

print(markdown.markdown("""
- eggs
- milk

1. mix
2. stew
""", extensions=[SemiSaneListExtension()]))
<ul>
<li>eggs</li>
<li>milk</li>
</ul>
<ol>
<li>mix</li>
<li>stew</li>
</ol>
class gfm.semi_sane_lists.SemiSaneListExtension(**kwargs)

Bases: markdown.extensions.Extension

An extension that causes lists to be treated the same way GitHub does.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
getConfig(key, default='')

Return a setting for the given key or an empty string.

getConfigInfo()

Return all config descriptions as a list of tuples.

getConfigs()

Return all config settings as a dict.

setConfig(key, value)

Set a config setting for key with the given value.

setConfigs(items)

Set multiple config settings given a dict or list of tuples.

class gfm.semi_sane_lists.SemiSaneOListProcessor(parser)

Bases: markdown.blockprocessors.OListProcessor

detab(text)

Remove a tab from the front of each line of the given text.

get_items(block)

Break a block into list items.

lastChild(parent)

Return the last child of an etree element.

looseDetab(text, level=1)

Remove a tab from the front of lines, but allow dedented lines.

run(parent, blocks)

Run processor. Must be overridden by subclasses.

When the parser determines the appropriate type of a block, the parser will call the corresponding processor’s run method. This method should parse the individual lines of the block and append them to the etree.

Note that both the parent and etree keywords are pointers to instances of the objects which should be edited in place. Each processor must make changes to the existing objects as there is no mechanism to return new/different objects to replace them.

This means that this method should be adding SubElements or adding text to the parent, and should remove (pop) or add (insert) items to the list of blocks.

Keywords:

  • parent: An etree element which is the parent of the current block.
  • blocks: A list of all remaining blocks of the document.
test(parent, block)

Test for block type. Must be overridden by subclasses.

As the parser loops through processors, it will call the test method on each to determine if the given block of text is of that type. This method must return a boolean True or False. The actual method of testing is left to the needs of that particular block type. It could be as simple as block.startswith(some_string) or a complex regular expression. As the block type may be different depending on the parent of the block (i.e. inside a list), the parent etree element is also provided and may be used as part of the test.

Keywords:

  • parent: An etree element which will be the parent of the block.
  • block: A block of text from the source which has been split at
    blank lines.
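The test/run protocol described above can be sketched without the Markdown library. The class below is a toy stand-in built on the standard library only: `ShoutProcessor` and its `"!!! "` prefix are hypothetical, and a real processor would subclass a class from `markdown.blockprocessors` instead.

```python
import xml.etree.ElementTree as etree


class ShoutProcessor:
    """Toy stand-in that follows the test/run protocol described above."""

    PREFIX = "!!! "  # hypothetical block marker, for illustration only

    def test(self, parent, block):
        # Must return a plain boolean; a prefix check is enough here.
        return block.startswith(self.PREFIX)

    def run(self, parent, blocks):
        # Consume the accepted block by popping it off the shared list.
        block = blocks.pop(0)
        # Edit the tree in place: nothing is returned to replace the parent.
        el = etree.SubElement(parent, "p")
        el.set("class", "shout")
        el.text = block[len(self.PREFIX):].upper()


root = etree.Element("div")
blocks = ["!!! hello", "plain text"]
proc = ShoutProcessor()
if proc.test(root, blocks[0]):
    proc.run(root, blocks)

print(root[0].text)  # text of the element added by run()
print(blocks)        # the consumed block is gone from the list
```

The point of the sketch is that `run` communicates only through side effects: it appends to `parent` and removes what it consumed from `blocks`.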
class gfm.semi_sane_lists.SemiSaneUListProcessor(parser)

Bases: markdown.blockprocessors.UListProcessor

detab(text)

Remove a tab from the front of each line of the given text.

get_items(block)

Break a block into list items.

lastChild(parent)

Return the last child of an etree element.

looseDetab(text, level=1)

Remove a tab from the front of lines, but allow dedented lines.

run(parent, blocks)

Run processor. Must be overridden by subclasses.

When the parser determines the appropriate type of a block, the parser will call the corresponding processor’s run method. This method should parse the individual lines of the block and append them to the etree.

Note that both the parent and etree keywords are pointers to instances of the objects which should be edited in place. Each processor must make changes to the existing objects as there is no mechanism to return new/different objects to replace them.

This means that this method should be adding SubElements or adding text to the parent, and should remove (pop) or add (insert) items to the list of blocks.

Keywords:

  • parent: An etree element which is the parent of the current block.
  • blocks: A list of all remaining blocks of the document.
test(parent, block)

Test for block type. Must be overridden by subclasses.

As the parser loops through processors, it will call the test method on each to determine if the given block of text is of that type. This method must return a boolean True or False. The actual method of testing is left to the needs of that particular block type. It could be as simple as block.startswith(some_string) or a complex regular expression. As the block type may be different depending on the parent of the block (i.e. inside a list), the parent etree element is also provided and may be used as part of the test.

Keywords:

  • parent: An etree element which will be the parent of the block.
  • block: A block of text from the source which has been split at
    blank lines.

gfm.strikethrough – Strike-through support

The gfm.strikethrough module provides GitHub-like syntax for strike-through text, that is text between double tildes: some ~~strike-through'ed~~ text
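As a rough illustration of the double-tilde syntax, the sketch below uses a plain regular expression from the standard library. The `STRIKE` pattern and the `strike` helper are hypothetical and are not the pattern the extension actually registers; they only show which spans of text the syntax targets.

```python
import re

# Hypothetical pattern for illustration; the extension's real pattern may differ.
STRIKE = re.compile(r"~~(.+?)~~")


def strike(text):
    """Wrap text found between double tildes in <del> tags."""
    return STRIKE.sub(r"<del>\1</del>", text)


print(strike("I ~~like~~ love you!"))
print(strike("single ~tilde~ stays as is"))
```

Text wrapped in single tildes is left untouched by this pattern.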

Typical usage

import markdown
from gfm import StrikethroughExtension

print(markdown.markdown("I ~~like~~ love you!",
                        extensions=[StrikethroughExtension()]))
<p>I <del>like</del> love you!</p>
class gfm.strikethrough.StrikethroughExtension(**kwargs)

Bases: markdown.extensions.Extension

An extension that adds support for strike-through text between two ~~.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
getConfig(key, default='')

Return a setting for the given key or an empty string.

getConfigInfo()

Return all config descriptions as a list of tuples.

getConfigs()

Return all config settings as a dict.

setConfig(key, value)

Set a config setting for key with the given value.

setConfigs(items)

Set multiple config settings given a dict or list of tuples.

gfm.tasklist – Task list support

The gfm.tasklist module provides GitHub-like support for task lists. Those are normal lists with a checkbox-like syntax at the beginning of items that will be converted to actual checkbox inputs. Nested lists are supported.

Example syntax:

- [x] milk
- [ ] eggs
- [x] chocolate
- [ ] if possible:
    1. [ ] solve world peace
    2. [ ] solve world hunger

Note

GitHub has support for updating the Markdown source text by toggling the checkbox (by clicking on it). This is not supported by this extension, as it requires server-side processing that is out of scope of a Markdown parser.

Available configuration options

Name            Type            Default  Description
unordered       bool            True     Set to False to disable parsing of unordered lists.
ordered         bool            True     Set to False to disable parsing of ordered lists.
max_depth       integer                  Set to a positive integer to stop parsing nested task lists that are deeper than this limit.
list_attrs      dict, callable  {}       Attributes to be added to the <ul> or <ol> element containing the items.
item_attrs      dict, callable  {}       Attributes to be added to the <li> element containing the checkbox. See Item attributes.
checkbox_attrs  dict, callable  {}       Attributes to be added to the checkbox element. See Checkbox attributes.

List attributes

If option list_attrs is a dict, the key-value pairs will be applied to the <ul> (unordered) or <ol> (ordered) list element, that is, the parent element of the <li> elements.

Warning

These attributes are applied to all nesting levels of lists, that is, to both the root lists and their potential sub-lists, recursively.

You can control this behavior by using a callable instead (see below).

If option list_attrs is a callable, it should be a function that respects the following prototype:

def function(list, depth: int) -> dict:

where:

  • list is the <ul> or <ol> element;
  • depth is the depth of this list relative to its root list (root lists have a depth of 1).

The returned dict items will be applied as HTML attributes to the list element.

Note

Thanks to this feature, you could apply attributes to root lists only. Take this code sample:

import markdown
from gfm import TaskListExtension

def list_attr_cb(list, depth):
    if depth == 1:
        return {'class': 'tasklist'}
    return {}

tl_ext = TaskListExtension(list_attrs=list_attr_cb)

print(markdown.markdown("""
- [x] some thing
- [ ] some other
    - [ ] sub thing
    - [ ] sub other
""", extensions=[tl_ext]))

In this example, only the root list will have the tasklist class attribute, not the one containing “sub” items.

Item attributes

If option item_attrs is a dict, the key-value pairs will be applied to the <li> element as its HTML attributes.

Example:

item_attrs={'class': 'list-item'}

will result in:

<li class="list-item">...</li>

If option item_attrs is a callable, it should be a function that respects the following prototype:

def function(parent, element, checkbox) -> dict:

where:

  • parent is the parent element of the <li> (the enclosing <ul> or <ol>);
  • element is the <li> element;
  • checkbox is the generated <input type="checkbox"> element.

The returned dict items will be applied as HTML attributes to the <li> element containing the checkbox.
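A callable for item_attrs might look like the sketch below. The name `item_attrs_cb` and the class names it returns are hypothetical. The sketch also assumes that the generated checkbox already carries its `checked` attribute when the callback runs, which this page does not confirm. The elements are built by hand with the standard library so the callback can be tried in isolation.

```python
import xml.etree.ElementTree as etree


def item_attrs_cb(parent, element, checkbox):
    """Hypothetical callback: tag each <li> with the state of its checkbox."""
    # Assumption: checked="checked" is already set on the checkbox here.
    state = "done" if checkbox.get("checked") else "todo"
    return {"class": "task-item task-" + state}


# Hand-built stand-ins for the elements the extension would pass in.
ul = etree.Element("ul")
li = etree.SubElement(ul, "li")
box = etree.SubElement(li, "input", {"type": "checkbox", "checked": "checked"})

print(item_attrs_cb(ul, li, box))
```

Such a function would be passed as `TaskListExtension(item_attrs=item_attrs_cb)`.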

Checkbox attributes

If option checkbox_attrs is a dict, the key-value pairs will be applied to the <input type="checkbox"> element as its HTML attributes.

Example:

checkbox_attrs={'class': 'list-cb'}

will result in:

<li><input type="checkbox" class="list-cb"> ...</li>

If option checkbox_attrs is a callable, it should be a function that respects the following prototype:

def function(parent, element) -> dict:

where:

  • parent is the parent element of the <li> (the enclosing <ul> or <ol>);
  • element is the <li> element.

The returned dict items will be applied as HTML attributes to the checkbox element.
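A callable for checkbox_attrs could be sketched as follows. The name `checkbox_attrs_cb` and the `data-list-type` attribute are hypothetical, and the sketch assumes that `parent` is the <ul> or <ol> element that contains the <li>. The elements are built by hand with the standard library so the callback can be tried on its own.

```python
import xml.etree.ElementTree as etree


def checkbox_attrs_cb(parent, element):
    """Hypothetical callback: add a class and record the kind of list."""
    # Assumption: `parent` is the list element that contains the <li>.
    return {"class": "list-cb", "data-list-type": parent.tag}


# Hand-built stand-ins for the elements the extension would pass in.
ol = etree.Element("ol")
li = etree.SubElement(ol, "li")

print(checkbox_attrs_cb(ol, li))
```

Such a function would be passed as `TaskListExtension(checkbox_attrs=checkbox_attrs_cb)`.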

Typical usage

import markdown
from gfm import TaskListExtension

print(markdown.markdown("""
- [x] milk
- [ ] eggs
- [x] chocolate
- not a checkbox
""", extensions=[TaskListExtension()]))
<ul>
<li><input checked="checked" disabled="disabled" type="checkbox" /> milk</li>
<li><input disabled="disabled" type="checkbox" /> eggs</li>
<li><input checked="checked" disabled="disabled" type="checkbox" /> chocolate</li>
<li>not a checkbox</li>
</ul>
class gfm.tasklist.TaskListExtension(**kwargs)

Bases: markdown.extensions.Extension

An extension that supports GitHub task lists. Both ordered and unordered lists are supported and can be separately enabled. Nested lists are supported.

Example:

- [x] milk
- [ ] eggs
- [x] chocolate
- [ ] if possible:
    1. [ ] solve world peace
    2. [ ] solve world hunger
extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
getConfig(key, default='')

Return a setting for the given key or an empty string.

getConfigInfo()

Return all config descriptions as a list of tuples.

getConfigs()

Return all config settings as a dict.

setConfig(key, value)

Set a config setting for key with the given value.

setConfigs(items)

Set multiple config settings given a dict or list of tuples.

class gfm.tasklist.TaskListProcessor(ext)

Bases: markdown.treeprocessors.Treeprocessor

run(root)

Subclasses of Treeprocessor should implement a run method, which takes a root ElementTree. This method can return another ElementTree object, and the existing root ElementTree will be replaced, or it can modify the current tree and return None.
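The "modify the current tree and return None" option described above can be sketched with the standard library alone. The `run` function below is a toy pass and not the code of TaskListProcessor: it only handles the literal prefixes `[ ] ` and `[x] `, and it ignores the configuration options and nesting limits.

```python
import xml.etree.ElementTree as etree


def run(root):
    """Toy tree pass that edits `root` in place and returns None."""
    # Copy the iterator into a list, since the tree is modified in the loop.
    for li in list(root.iter("li")):
        text = li.text or ""
        if text[:4] not in ("[ ] ", "[x] "):
            continue  # not a task item, leave it untouched
        box = etree.Element("input", {"type": "checkbox", "disabled": "disabled"})
        if text[1] == "x":
            box.set("checked", "checked")
        # Move the remaining label text behind the new checkbox element.
        box.tail = " " + text[4:]
        li.text = None
        li.insert(0, box)
    return None


root = etree.fromstring("<ul><li>[x] milk</li><li>not a checkbox</li></ul>")
result = run(root)
print(etree.tostring(root, encoding="unicode"))
```

Because the function returns None, the caller keeps using the same `root` object, which now contains the checkbox element.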

Modules

gfm – Base module for GitHub-Flavored Markdown

class gfm.AutolinkExtension(**kwargs)

An extension that turns URLs into links.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
class gfm.AutomailExtension(**kwargs)

An extension that turns email addresses into links.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
class gfm.SemiSaneListExtension(**kwargs)

An extension that causes lists to be treated the same way GitHub does.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
class gfm.StandaloneFencedCodeExtension(**kwargs)
extendMarkdown(md)

Add FencedBlockPreprocessor to the Markdown instance.

class gfm.StrikethroughExtension(**kwargs)

An extension that adds support for strike-through text between two ~~.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
class gfm.TaskListExtension(**kwargs)

An extension that supports GitHub task lists. Both ordered and unordered lists are supported and can be separately enabled. Nested lists are supported.

Example:

- [x] milk
- [ ] eggs
- [x] chocolate
- [ ] if possible:
    1. [ ] solve world peace
    2. [ ] solve world hunger
extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.

mdx_gfm – Full extension for GFM (comments, issues)

An extension that is as compatible as possible with GitHub-flavored Markdown (GFM).

This extension aims to be compatible with the standard GFM that GitHub uses for comments and issues. It has all the extensions described in the GFM documentation, except for intra-GitHub links to commits, repositories, and issues.

Note that Markdown-formatted gists and files (including READMEs) on GitHub use a slightly different variant of GFM. For that, use mdx_partial_gfm.PartialGithubFlavoredMarkdownExtension.

class mdx_gfm.GithubFlavoredMarkdownExtension(**kwargs)

Bases: mdx_partial_gfm.PartialGithubFlavoredMarkdownExtension

An extension that is as compatible as possible with GitHub-flavored Markdown (GFM).

This extension aims to be compatible with the standard GFM that GitHub uses for comments and issues. It has all the extensions described in the GFM documentation, except for intra-GitHub links to commits, repositories, and issues.

Note that Markdown-formatted gists and files (including READMEs) on GitHub use a slightly different variant of GFM. For that, use mdx_partial_gfm.PartialGithubFlavoredMarkdownExtension.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
getConfig(key, default='')

Return a setting for the given key or an empty string.

getConfigInfo()

Return all config descriptions as a list of tuples.

getConfigs()

Return all config settings as a dict.

setConfig(key, value)

Set a config setting for key with the given value.

setConfigs(items)

Set multiple config settings given a dict or list of tuples.

mdx_partial_gfm – Partial extension for GFM (READMEs, wiki)

An extension that is as compatible as possible with GitHub-flavored Markdown (GFM).

This extension aims to be compatible with the variant of GFM that GitHub uses for Markdown-formatted gists and files (including READMEs). This variant seems to have all the extensions described in the GFM documentation, except:

  • Newlines in paragraphs are not transformed into br tags.
  • Intra-GitHub links to commits, repositories, and issues are not supported.

If you need support for features specific to GitHub comments and issues, please use mdx_gfm.GithubFlavoredMarkdownExtension.

class mdx_partial_gfm.PartialGithubFlavoredMarkdownExtension(**kwargs)

Bases: markdown.extensions.Extension

An extension that is as compatible as possible with GitHub-flavored Markdown (GFM).

This extension aims to be compatible with the variant of GFM that GitHub uses for Markdown-formatted gists and files (including READMEs). This variant seems to have all the extensions described in the GFM documentation, except:

  • Newlines in paragraphs are not transformed into br tags.
  • Intra-GitHub links to commits, repositories, and issues are not supported.

If you need support for features specific to GitHub comments and issues, please use mdx_gfm.GithubFlavoredMarkdownExtension.

extendMarkdown(md)

Add the various processors and patterns to the Markdown instance.

This method must be overridden by every extension.

Keyword arguments:

  • md: The Markdown instance.
  • md_globals: Global variables in the markdown module namespace.
getConfig(key, default='')

Return a setting for the given key or an empty string.

getConfigInfo()

Return all config descriptions as a list of tuples.

getConfigs()

Return all config settings as a dict.

setConfig(key, value)

Set a config setting for key with the given value.

setConfigs(items)

Set multiple config settings given a dict or list of tuples.

Supported features

  • Fenced code blocks
  • Literal line breaks
  • Tables
  • Hyperlink parsing (http, https, ftp, email and www subdomains)
  • Code highlighting for code blocks if Pygments is available
  • Mixed-style lists with no separation
  • Strikethrough
  • Task lists

Unsupported features

This implementation does not support all GFM features and has known differences in how rendering is done.

  • By design, links to commits, issues, pull requests and user profiles are not supported, since this is application-specific. Feel free to subclass the provided classes to implement your own logic.
  • There is no emoji support.
  • There is no horizontal rule (---, i.e. <hr>) support.
  • Nested lists do not behave exactly like GitHub’s: issue #10.
  • Contrary to GitHub, only double-tilde’d text renders as strikethrough, not single-tilde’d: issue #14.
