Working with Markdown in Python

Markdown makes it easy to add formatting to plain text documents while keeping them readable for both people and machines. In this article, learn how to work with Markdown in Python using the python-markdown package.

If you use the Internet, you have surely come across the term Markdown. Markdown is a lightweight markup language that makes it very easy to write formatted content. It was created by John Gruber and Aaron Swartz in 2004. It uses very easy-to-remember syntax and is therefore used by many bloggers and content writers around the world. Even this blog that you are reading is written and formatted using Markdown.

Markdown is one of the most widely used formats for storing formatted text. It integrates easily with Web technologies because it can be converted to HTML, and HTML can be converted back to Markdown, with the right tools. It allows you to write HTML elements, such as headings, lists, images, links, tables, and more, without much effort or code. It is used in blogs, content management systems, wikis, documentation, and many more places.

In this article, you'll learn how to work with Markdown in a Python application using different Python packages, including markdown, python-frontmatter, and markdownify.

Prerequisites

To follow along with this tutorial, you’ll need the following:

  • Python v3.x
  • Basic understanding of HTML and Markdown

Setting Up a Project

Before proceeding with the project, you’ll need to set up a project directory to work in.

So, first, open up your terminal, navigate to a path of your choice, and create a project directory (python-markdown) by running the following commands in the terminal:

mkdir python-markdown
cd python-markdown

Finally, create and activate the virtual environment (venv) for your Python project by running the following commands:

python3 -m venv venv
source venv/bin/activate

That’s it. The project setup is complete.

Converting Markdown to HTML in Python

One of the most common operations related to Markdown is converting it to HTML. By doing so, you can write your content in Markdown and then compile it to HTML, which you can then deploy to a CDN or server.

First, install the python-markdown package by running the following command in the terminal:

pip install markdown

Next, at your project’s root directory, create a main.py file and add the following code to it:

# 1
import markdown

markdown_string = '# Hello World'

# 2
html_string = markdown.markdown(markdown_string)
print(html_string)

In the above code, you are doing the following:

  1. Importing the markdown module.
  2. Converting the markdown (markdown_string) to HTML (html_string) using the markdown method from the markdown package.

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, you’ll get the HTML output as follows:

Markdown to HTML.

You can try a more complex Markdown string like the one in the code below and use it to create HTML:

markdown_string = '''
# Hello World

This is a **great** tutorial about using Markdown in [Python](https://python.org).
'''

In this example, you make use of headings, bold text, and links in Markdown.

Markdown to HTML.
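
As a quick check of the example above, the sketch below converts the longer string and prints the result. The output shown in the comments assumes the default python-markdown settings with no extensions enabled.

```python
import markdown

markdown_string = '''
# Hello World

This is a **great** tutorial about using Markdown in [Python](https://python.org).
'''

# Convert the Markdown source to an HTML fragment (no <html> or <body> wrapper).
html_string = markdown.markdown(markdown_string)
print(html_string)
# <h1>Hello World</h1>
# <p>This is a <strong>great</strong> tutorial about using Markdown in <a href="https://python.org">Python</a>.</p>
```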

Converting a Markdown File to HTML in Python

Most of the time, you’ll be working with Markdown files rather than Markdown strings. Therefore, it makes sense to learn how to convert a Markdown file to an HTML file.

To do so, first, create a sample.md file and add the following code to it:

# Hello World

This is a **Markdown** file.

Next, replace the existing code in the main.py file with the following:

import markdown

# 1
with open('sample.md', 'r') as f:
    markdown_string = f.read()

# 2
html_string = markdown.markdown(markdown_string)

# 3
with open('sample.html', 'w') as f:
    f.write(html_string)

In the above code, you are doing the following:

  1. Reading the sample.md and storing its content in the markdown_string variable.
  2. Converting the markdown (markdown_string) to HTML (html_string) using the markdown method from the markdown package.
  3. Creating a sample.html file and writing the HTML (html_string) to it.

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, you’ll see a sample.html file in your project’s root directory:

Markdown file to HTML file.
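
The script above relies on the platform's default text encoding, which can differ between systems. Here is a minimal sketch of the same conversion with an explicit UTF-8 encoding. The helper name convert_markdown_file is hypothetical, and the temporary directory is only there so the demo does not touch your project files.

```python
import tempfile
from pathlib import Path

import markdown


def convert_markdown_file(source: Path, target: Path) -> None:
    """Read a Markdown file, convert it, and write the HTML to target."""
    markdown_string = source.read_text(encoding='utf-8')
    html_string = markdown.markdown(markdown_string)
    target.write_text(html_string, encoding='utf-8')


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'sample.md'
    target = Path(tmp) / 'sample.html'
    source.write_text('# Hello World\n\nThis is a **Markdown** file.\n', encoding='utf-8')

    convert_markdown_file(source, target)
    result = target.read_text(encoding='utf-8')

print(result)
```

In a real project, you would call convert_markdown_file(Path('sample.md'), Path('sample.html')) directly instead of using the temporary directory.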

Converting HTML to Markdown in Python

Sometimes, a situation arises where you might want to convert HTML to Markdown. For this purpose, you can use the markdownify package in Python.

First, install the package by running the following command in the terminal:

pip install markdownify

Next, replace the existing code in the main.py file with the following:

# 1
import markdownify

html_string = '''
<h1>Hello World</h1>
<p>This is a great tutorial about using Markdown in Python.</p>
'''

# 2
markdown_string = markdownify.markdownify(html_string)
print(markdown_string)

In the above code, you are doing the following:

  1. Importing the markdownify module.
  2. Converting the HTML (html_string) to Markdown (markdown_string) using the markdownify method from the markdownify package.

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, you’ll get the Markdown output:

HTML to Markdown.

In the output above, the heading (<h1>) is rendered by "underlining" it with equal signs (=) rather than prefixing it with a hash sign (#). Markdown supports two heading styles, Setext (underlined) and ATX (hash-prefixed), and markdownify uses the underlined Setext style by default. You can configure markdownify to use ATX-style headings by passing the heading_style='ATX' parameter to the markdownify function.

Markdownify also supports a number of options, including HTML tag stripping, HTML tag conversion, Markdown heading styles, and more.

Converting an HTML File to Markdown in Python

Previously, we converted a Markdown file to an HTML file. However, sometimes, you might need to convert an HTML file to a Markdown file.

To do so, first, create a sample.html file and add the following code to it:

<!DOCTYPE html>
<html lang="en">
<body>
    <h1>Hello World</h1>
    <p>This is a <strong>HTML</strong> file.</p>
    <a href="https://honeybadger.io/">Visit Honeybadger</a>
</body>
</html>

Next, replace the existing code in the main.py file with the following:

import markdownify

# 1
with open('sample.html', 'r') as f:
    html_string = f.read()

# 2
markdown_string = markdownify.markdownify(html_string, heading_style='ATX')

# 3
with open('sample.md', 'w') as f:
    f.write(markdown_string)

In the above code, you’re doing the following:

  1. Reading the sample.html and storing its content in the html_string variable.
  2. Converting the HTML (html_string) to Markdown (markdown_string) using the markdownify method from the markdownify package.
  3. Creating a sample.md file and writing the Markdown (markdown_string) to it.

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, you’ll see a sample.md file in your project’s root directory as follows:

HTML file to Markdown file.

Reading Markdown Front Matter in Python

A Markdown file often carries some variables or metadata of its own. This is known as front matter. Front matter variables are a great way to store extra information about a Markdown file. For example, a blog's Markdown files can have front matter variables like Title, Author, Image, Published At, and more.

You can specify front matter at the beginning of a Markdown file by placing the YAML data variables between triple-dashed lines. For example,

---
title: Hello World
Author: John Doe
Published: 2020-01-20
---

In Python, you can parse Markdown front matter with the python-frontmatter package.

To see this package in action, first, install the package by running the following command in the terminal:

pip install python-frontmatter

Next, add the following front matter to the sample.md file:

---
title: Hello World
date: 2022-01-20
---

Next, replace the existing code in the main.py file with the following:

# 1
import frontmatter

# 2
data = frontmatter.load('sample.md')

# 3
print(data.keys())
print(data['title'])
print(data['date'])

In the above code, you are doing the following:

  1. Importing the frontmatter module.
  2. Reading the sample.md file using the load method from the frontmatter package and storing the result in the data variable.
  3. Listing the front matter variables with data.keys(). The object returned by load supports dictionary-style access, so you can also read individual keys (data['title'] or data['date']).

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, you’ll get the output of the front matter variables as follows:

Markdown front matter data.

Updating Markdown Front Matter in Python

You can also update existing front matter variables, or add new ones, using the python-frontmatter package.

To do so, first, replace the existing code in the main.py file with the following:

import frontmatter

# 1
data = frontmatter.load('sample.md')

# 2
data['author'] = 'John Doe'

# 3
data['title'] = 'Bye World'

# 4
updated_data = frontmatter.dumps(data)

# 5
with open('sample.md', 'w') as f:
    f.write(updated_data)

In the above code, you are doing the following:

  1. Reading the sample.md file with frontmatter.load().
  2. Adding a new key (author) to the front matter data variable and assigning it a value (John Doe).
  3. Updating the existing key (title) and assigning it a new value (Bye World).
  4. Serializing (frontmatter.dumps()) the data variable to a string and storing the result in the updated_data variable.
  5. Updating the sample.md file by writing the updated Markdown (updated_data) to it.

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, check the sample.md file for the updated front matter data, as follows:

Updated Markdown front matter data.

Using Python Markdown Extensions

The python-markdown package also supports extensions that allow you to modify and/or extend the default behavior of the Markdown parser. For example, to generate a table of contents (TOC), you can use the toc extension. There are other extensions, as well, which you can make use of based on your requirements.

To create a TOC for your Markdown content, first, replace the existing code in the main.py file with the following:

import markdown

# 1
markdown_string = '''
[TOC]

# Hello World

This is a **great** tutorial about using Markdown in [Python](https://python.org).

# Bye World
'''

# 2
html_string = markdown.markdown(markdown_string, extensions=['toc'])
print(html_string)

In the above code, you are doing the following:

  1. Specifying the [TOC] string in your Markdown (markdown_string) where you want to add the table of contents.
  2. Adding the extensions parameter to the markdown method from the markdown package and specifying the extensions (['toc']) you want to use.

Finally, save your code and run the main.py file by running the following command in the terminal:

python main.py

Once the code execution is complete, you’ll get the HTML output with the Table of Contents as a list:

Table of Contents.
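
If you want the table of contents as a separate fragment, for example to place it in a sidebar, a reusable markdown.Markdown instance makes it available after conversion. The sketch below assumes the same toc extension and leaves out the [TOC] marker, so the list is not embedded in the main HTML.

```python
import markdown

markdown_string = '''
# Hello World

Some text.

# Bye World
'''

# A reusable Markdown instance exposes extension results as attributes.
md = markdown.Markdown(extensions=['toc'])
html_string = md.convert(markdown_string)

print(html_string)  # headings now carry id attributes, e.g. id="hello-world"
print(md.toc)       # the table of contents as a separate HTML fragment
```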

Conclusion

Learning to work with Markdown can help you in lots of ways. Using Python, you can automate many tasks, including maintaining and manipulating Markdown files. For example, you can write a script that creates an index for all of your Markdown files in your blog or organize your Markdown files into different directories based on the front matter data variables, such as tags/categories.

Honeybadger, which is a cloud-based system for real-time monitoring, error tracking, and exception-catching, also uses Markdown to maintain our documentation. In case you are interested, we wrote a blog post in which we talk about how we built a documentation workflow in Rails.

    Ravgeet Dhillon

    Ravgeet Dhillon is a Full Stack Developer - React, Vue, Flutter, Strapi, Python, MySQL, and Technical Content Writer. He runs a one-man freelance agency, RavSam.in, via which he helps startups, businesses, and open-source organizations with Digital Product Development. He loves to play outdoor sports and cycles every day.
