Markdown: the lightweight text format you're (probably) already using
Let's start at the beginning one last time
What if you wanted to write a document, but wanted a lighter interface for composition than a word processor? What if you wanted to just write text instead of worrying about format compatibility, font considerations when sharing across devices, or breaking your entire document with mild visual changes -- but still wanted rich text? What if you wanted something that you could easily publish to the Web, but didn't want to use HTML to compose long-form content?
Enter Markdown.
Markdown is a lightweight format, created by John Gruber and Aaron Swartz, for text composition that decorates text by wrapping it in punctuation. It's plain text that can be composed in any plain text editor -- but the real party trick is that the syntax is designed to convert to and from HTML. So by writing in Markdown, you can write Web content without actually needing to know how to structure (or style) a Web page.
And the truth is, you've interacted with Markdown content already.
If you've ever used...
- Discord
- Slack
- GitHub
- Trello
- Notion
You've written messages or notes or comments using a dialect of Markdown. (I'll get further into the weeds on what I mean by that in a bit.)
You're also reading content written in Markdown. Right now.
Because to put it into a sentence: the purpose of Markdown is writing, for the Web.
Whip, and release (So how does it work?)
To be clear up front, this isn't meant to be a comprehensive guide. Unlike that detailed piece on Web scraping, I'm really not aiming for several thousand words on this topic. Markdown Guide is a good starting point if you are looking for something more thorough. This piece will touch on some deeper topics, but treat it as more of a sampling. (I also wrote this in part for an A+ test prep group I've been working with recently, so it's not strictly written for casual users.) I'll try to mark off the sections you might want to skip if you don't care about getting more in-depth.
(This is also a loose adaptation of what I'd initially outlined as a talk. The overall flow of what it turned into is pretty heavily influenced anyway by the excellent No Boilerplate, which I would highly recommend checking out.)
Like I said, Markdown decorates text mostly by wrapping it in various punctuation. You get italics by `*wrapping with one asterisk*`, bold `**with two**`, and in many dialects you can do either of the above using underscores if you prefer. Given the option, it may even be useful `__*to blend them*__` for clarity's sake when you're using more than one together.
What about that monospaced font you just saw, you wonder? (I assume.)
You use backticks like `these ones` for that.
And if you're currently wondering what the hell that was...
```md
It's called a fenced code block.
Start and end the block with a line of three backticks,
and everything inside it will be formatted this way
```
You can use those to separate out... well, blocks of code. Additionally, you can specify a language by adding it on the opening line, like above. You can use the language's full name, or just its file extension like I'm doing in the example.
How about lists? To condense a few of these things into a short answer, this:
## ISO Standard Urban Groceries
<!-- this is a comment -->
* [bread](https://tvtropes.org/pmwiki/pmwiki.php/Main/ISOStandardUrbanGroceries)
* eggs
* milk
* [squick](https://tvtropes.org/pmwiki/pmwiki.php/Main/BreadEggsMilkSquick)
Is equivalent to this:
<h2>ISO Standard Urban Groceries</h2>
<!-- this is still a comment -->
<ul>
<li>
<a href="https://tvtropes.org/pmwiki/pmwiki.php/Main/ISOStandardUrbanGroceries">
bread
</a>
</li>
<li>eggs</li>
<li>milk</li>
<li>
<a href="https://tvtropes.org/pmwiki/pmwiki.php/Main/BreadEggsMilkSquick">
squick
</a>
</li>
</ul>
You can nest list items by adding an indent, optionally use dashes instead of asterisks, and get an ordered list by using numbers instead.
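To make that equivalence a little more concrete, here's a minimal Python sketch of the conversion for just the pieces shown above: headings, flat bulleted lists, inline links, and raw HTML passed straight through. The names `render` and `inline` are made up for illustration, and this is nowhere near a real Markdown parser (no nesting, no ordered lists, no emphasis), so treat it as a toy under those assumptions.

```python
import html
import re

# [label](url) -- a simplified link pattern, assumed for this sketch
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def inline(text: str) -> str:
    """Escape plain text, then turn [label](url) into an anchor tag."""
    escaped = html.escape(text, quote=False)
    return LINK.sub(r'<a href="\2">\1</a>', escaped)


def render(markdown: str) -> str:
    """Convert headings, flat bullet lists, and links; pass raw HTML through."""
    out, in_list = [], False
    for line in markdown.splitlines():
        item = re.match(r"[*-] (.+)", line)
        heading = re.match(r"(#{1,6}) (.+)", line)
        if item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{inline(item.group(1))}</li>")
            continue
        if in_list:  # any non-item line closes the open list
            out.append("</ul>")
            in_list = False
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{inline(heading.group(2))}</h{level}>")
        elif line.lstrip().startswith("<"):
            out.append(line)  # raw HTML, such as a comment, is left alone
        elif line.strip():
            out.append(f"<p>{inline(line)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


print(render("## Groceries\n<!-- a comment -->\n* [bread](https://example.com/bread)\n* eggs"))
```

The comment line survives untouched, which is the "HTML is valid Markdown" behavior in miniature.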
Images work like this:
![Carbonated Beverage Language Map](https://imgs.xkcd.com/comics/carbonated_beverage_language_map.png)
Which is equivalent to this:
<img src="https://imgs.xkcd.com/comics/carbonated_beverage_language_map.png" alt="Carbonated Beverage Language Map">
And renders this:
(This is from the excellent xkcd, by the way.)
I should note here that there isn't, to my knowledge, native support for other image attributes, or for modern `<picture>` and `<source>` tags, so it might not be robust enough for your use case if you need, say, image fallbacks.
As less of an aside, curious console browsers might notice that aside is... actually an `<aside>`. That's actually not generated using Markdown. And even if you didn't pop that open until just now (or still aren't), you might still notice that the comments are the same across both of the list examples. The reason for both of these things is that Markdown is actually a superset of HTML, and HTML itself is valid Markdown.
That's important, because you can still pepper bare elements throughout your documents as necessary if you need specific page structure, or that image functionality I mentioned, or even just targets for styling.
Speaking of which, because it's ultimately HTML, it's also styled the same way. In fact, most of what you're currently looking at is just hand-rolled CSS. (The one exception is the code fences, which use a third-party theming tool -- I draw the line somewhere at doing syntax highlighting manually.)
Spider-Man pointed first! (Variants)
So when I said "dialects" earlier, what I meant is that Markdown is actually a loose collection of ~~sodas~~ formats. The original implementation is fairly minimal, designed to prioritize readability, and most importantly it's free, so various derivatives have been developed over the years to either extend the language or bring in support for other projects with overlapping purposes.
Specifically, Markdown is open source -- meaning that it's licensed in a way that enables anyone to freely use, redistribute, and modify it to create derivatives. (This will come up again later.)
It's not critical that you know all of their nuances, but when you hop around various parts of the Internet that interface with it, it can be helpful to be aware that it's not a monolith, and the flavor you're using on, say, Reddit, might not map 1:1 with what you're doing on Slack.
Think of it like English: you'll see differences in usage from one region to the next, even if the broader conventions are mostly the same.
To give you a brief rundown -- getting into some detail you might want to skip:
CommonMark is one... well, common example. It's an effort by the maintainers of several major community projects to develop a formal specification around the language. It aims to clarify some ambiguities in the original, define priority when there's overlapping markup, and standardize some of the extensions that had been built around the language by various third parties. One example from above that's found in the CommonMark spec is fenced code blocks.
Another widely used derivative is GitHub Flavored Markdown. It builds on the CommonMark specification, adding support for strikethrough, tables, and task lists, among others. An important distinction is that it filters out several raw HTML elements when being displayed as a security measure, notably including `<script>`.
A few other examples of extensions to the language worth being aware of:
- Front Matter is an extension that allows you to define data with your Markdown files: values like titles, publication date, tags, or... whatever else you want, really, in place of needing to store that information in something more involved, like a database. It takes various fields that you would typically see in a CMS -- ones like `title`, `date`, `tags`, etc. -- and optionally allows you to create and use custom ones. Front Matter leverages existing configuration languages like YAML and JSON (as well as a couple of others), allowing you to define these values in whichever you're most comfortable with.
  - Front Matter itself is also an extension to Visual Studio Code -- a popular open-source dev environment -- but since it's ultimately text there are several other tools that can parse it.
- MDX adds support for JSX, a templating syntax based on JavaScript.
- Notebook interfaces like Jupyter, Observable, and LiveBook -- interactive documents that enable you to embed and run code inside them, and then display their results.
- These are discrete from Markdown itself, but many environments that support Markdown (including some of the ones above) also support LaTeX and Mermaid, adjacent text-based formats for presentation and diagramming. (Mermaid itself also explicitly cites Markdown as inspiration.)
Additionally, Slack, Reddit, and others have their own particular spins on the language that don't exactly line up with the above. Some are simplified, some handle specific functionality, and they'll deviate from CommonMark in some ways. (Username mentions and spoiler tags are some examples.)
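Since Front Matter came up in the list above, here's a rough Python sketch of what pulling it out of a document can look like. It assumes the common convention of a block fenced by `---` lines at the very top of the file, and it only understands flat `key: value` pairs rather than real YAML. The function name `split_front_matter` is hypothetical; an actual tool would hand the block to a proper YAML or JSON parser.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body) for a document with optional front matter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no opening delimiter: treat everything as body
    try:
        # index of the closing '---' line
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return {}, text  # never closed: assume it wasn't front matter
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()  # values stay plain strings
    return meta, "\n".join(lines[end + 1:])


doc = "---\ntitle: Hello\ndate: 2024-01-01\n---\n# Heading\nBody text"
meta, body = split_front_matter(doc)
print(meta)
print(body)
```

Everything after the closing delimiter is ordinary Markdown, which is why the same file still opens fine in any plain text editor.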
Welcome to the Spider Society (Markdown. Markdown Everywhere.)
Now, the ease of reading/writing `.md` documents is great, but the real power is in its portability. The files don't store any formatting data of their own, unless you're manually adding CSS, so they're small in size, and you can use them anywhere that you can use plain text. Run across something that doesn't support Markdown? Change the file extension from `.md` to `.txt`, or even just copy to the clipboard, and be on your merry way. There's a multitude of different applications that support it out of the box, across basically any platform you could want to use. Web-based, of course, but there are also editors for every major desktop and mobile platform, and since it's plain text you can even use command-line options if you feel like it.
I can't call out every example here, and won't try, but there are a couple of text editors you might care to know about just for their broader impact.
A useful starting point is VS Code. This is a widely used option that will run on every major OS; it comes with a variety of convenience features built-in, and for what it doesn't have there's also an extension interface. Programmer or no, you can use it to live-preview Markdown content, it has support for revision-tracking built-in, and it natively supports notebooks.
Some additional details below if you feel like a bit of a dive:
- The builds from Microsoft itself are not open source -- they ship with a few proprietary components, most critically the registry for extensions -- but the full code for the application itself is in a public GitHub repo under a permissive license.
- A fully open-source build can be found in VSCodium, and various other editors have been built on top of its pieces. It uses a distinct package ecosystem, the Open VSX registry, which is also used by many Linux flavors that ship with VSC, as well as by derivative projects.
- Its editor, Monaco, is also widely used in various online code playgrounds.
- Similar to the libretro example below, VSC also has an extensible interface for (programming) language detection, called the LSP or Language Server Protocol, to provide common handlers for use in functionality like autocomplete and syntax highlighting, that has since seen adoption by other editors. This isn't limited to programming languages -- you'll also see it power functionality like this on Markdown documents.
Then there's Pulsar. Pulsar is even more malleable than VSC, treating every core part of the application as an extension (or "package" in their parlance). It also shares lineage with VSC, being the direct continuation of an older project called Atom.
Atom was originally developed at GitHub before VSC absorbed much of its userbase and GitHub's acquisition by Microsoft (VSC's maintainers) finally shuttered the project. More importantly, it was the original impetus for the creation of a tool called Electron -- which enables building desktop applications from Web technologies. In a sentence, it runs a Chrome process (yes, that one) to provide a UI that can then access lower-level system functions. Just as much as Markdown, Electron is something you're probably already using; not only did Atom usher in its own competition, with VSC itself using it, but it also powers desktop clients for several of the examples I started this entry with.
Which brings me back to what's really magic about Markdown: It. Runs. Everywhere. And so if you ever need an escape hatch because some tool you're using just started putting up paywalls, you can just keep your content, and take it elsewhere.
Just as importantly, because it runs everywhere, there are tools across numerous categories that leverage this. Here's a non-exhaustive list of ways this can be useful -- some of these tools run on the command line, but none of them require it. (I'll get into this more in the first one.)
In print it's libel (Document conversion)
You can use `pandoc` to convert to and from a variety of different text formats, including: HTML, Word documents, OpenDocument, [EPUB](https://pandoc.org/epub.html), several Wiki formats, and more. You can additionally output PDFs and slide decks from the contents.
Pandoc uses a more verbose format for Markdown by default -- it extends the syntax beyond much of what I've described above in order to preserve additional properties stored in HTML -- but it also supports several different Markdown variants. (Speaking of which, if you poke around CommonMark's site you'll find the Pandoc maintainer among the group maintaining the spec.)
There are various other ways to use this to preserve page content in particular, but I've already talked about examples in JavaScript elsewhere, and this isn't trying to get quite that deep.
A more interesting detail is that Pandoc itself has been extended into various other tools. PanWriter, for instance, is an editor that leverages Pandoc internally -- allowing you to write in plain text and output in whatever format you want, without you having to use a command line to do that. Tools like these are known as frontends -- which to put it a little too loosely refers to the UIs that you're interacting with in order to interface with some other tool that's running in a place not visible to the user. For some other widely used examples of this:
- Kodi and VLC are both frontends to a tool called `ffmpeg` -- which will convert, or transform, or play basically any form of media you can throw at it.
- Emulation UIs like RetroArch and... Kodi again, via a plugin called Retroplayer, run on top of a library called `libretro` that provides a common interface for typical functionality that's needed, but not directly related to emulating hardware -- stuff like controller config -- and splits out the parts that are related to emulating hardware into modules called "cores."
- Many websites are just frontends to a server somewhere that's providing you with content dynamically. So are the accompanying mobile or desktop apps, that are connecting you to the same platforms. (See: every form of social media.) So are the third-party clients that you might run across, whether they exist because there's first-party support (Reddit in particular used to be much friendlier to this) or just particularly enterprising users.
Much like many of these other asides, you'll see more of the concept as I go.
I'm something of a scientist myself (Knowledge management)
There's a wide array of different note-taking tools covering as wide an array of different paradigms, running the gamut from general notetaking applications to collaborative whiteboards to knowledge graphs. Below is a scattering of examples -- open source except where noted, covering a variety of different forms of notetaking. Some are widely used, others I just personally found interesting.
- Joplin is a general-purpose option that loosely resembles Evernote. It's multiplatform, and offers sync across a variety of services, including self-hosted ones.
- Obsidian and Roam each combine Markdown with Wiki-like backlinks to create a graph of the various knowledge in your workspace, and how those items relate to each other. Like many other options on this list, Obsidian also enables you to customize it with extensions. (That said, neither one is open source.)
- Logseq is an open source option offering a desktop client and an extension interface similar to Obsidian.
- Note that Logseq uses nonstandard Markdown, essentially defaulting to treating broken text sections as bullets rather than paragraphs. (These bullets are called "blocks" and each block can be directly linked.) It can be configured for long-form use, but it isn't the default experience.
- Affine is a note-taking tool that's broadly similar to Notion and Miro, offering both document editing and whiteboard interfaces. Notably, the team building this also maintains the entire editor UI as a distinct package called BlockSuite.
- HASH is a self-hostable knowledge management app built around the Block Protocol, which aims to build out an open component model for custom Notion-like blocks. (Notably, the Block Protocol is already supported in WordPress. GitHub also piloted it briefly as an alternative to standard READMEs, but isn't planning on moving forward with it.)
- SilverBullet is a customizable, Web-based note-taking tool built on Deno.
- Reor connects to locally-running AI models (or also to ChatGPT I guess...) to augment, query, and self-organize your notes.
Moreover, many of these use plain `.md` files as their data storage, enabling you to simply edit the contents directly in anything you like, should you feel like switching or even just a change of scenery. That also means you're not tied to whatever syncing solutions these applications might offer, and are free to run whatever backup method best suits you.
This is a critical distinction with Markdown, and many other open tools: you choose your interface.
A beautiful web of life and destiny (revision tracking)
Up front: what I'm about to describe is more programming-adjacent -- it's a version tracking system that works with raw text files. That means it also works with Markdown! But it's much more commonly used with code -- it was built to aid with Linux kernel development, so its audience from the outset has always been programmers, and it doesn't shy away from technical detail. A lot of it's personally out of my depth, and this is my attempt to condense the basic usage... but it's still pretty dense, and you might want to skip this whole section if you want a lighter read.
Another popular example commonly seen on the command line is `git`. With git, we can track and branch changes to text -- typically code, but you can use any plain text format, and Markdown is often a popular target. Git is an entire conversation of its own, but to attempt to put this into a few sentences: Git watches a folder for changes. Whenever you make edits inside it -- create, move, delete, or edit files -- you can store those changes in checkpoints called "commits," with each taking a set of files you want associated with it and a message briefly describing what the changes are for.
The files in a commit are selected manually, so instead of the system tracking every change you make, you're curating a history (or "commit log") that meaningfully describes what you did and when. Each commit can also be used to create branches that contain their own histories from that point on; these branches are also named, so you can use this system to travel through different save states across the folder's history. From there, there are various ways you can combine them. The whole of this history is called a repository (or "repo").
To better illustrate how one might use this, here's an example. With some of my spare energy, I've been working through a set of CTF (Capture the Flag) challenges -- essentially, simulated hacking puzzles. I have a repo on my local machine containing my notes, with two branches: `no-spoilers`, and `spoilers`. These are what they sound like, more or less -- one contains level descriptions from the website, and the other adds varying details on my solutions. From there, if I ever alter the base branch, I can splice those changes into `spoilers` (`git merge`) or even replay the whole set of `spoilers` changes on top of the latest version of `no-spoilers` (`git rebase`).
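If the commit-and-branch idea is still a bit abstract, here's a toy Python model of it. To be loud about the assumptions: `ToyRepo` and `Commit` are invented for this sketch, and this is not how Git actually stores anything. It only mirrors the shape described above, where a commit is a checkpoint holding a message, a snapshot of files, and a pointer to the commit before it, and a branch is just a name attached to the latest commit in its history.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    message: str                       # short description of the change
    files: dict                        # snapshot: filename -> contents
    parent: Optional["Commit"] = None  # previous checkpoint, if any


class ToyRepo:
    """Hypothetical stand-in for a repository; not Git's real data model."""

    def __init__(self, initial: str = "no-spoilers"):
        self.branches = {initial: None}  # branch name -> latest commit
        self.current = initial

    def commit(self, message: str, files: dict) -> None:
        parent = self.branches[self.current]
        snapshot = dict(parent.files) if parent is not None else {}
        snapshot.update(files)  # only the files you chose to include change
        self.branches[self.current] = Commit(message, snapshot, parent)

    def branch(self, name: str) -> None:
        # A new branch starts from the current commit, then becomes current.
        self.branches[name] = self.branches[self.current]
        self.current = name

    def log(self, name: str) -> list:
        # Walk parent pointers from newest to oldest.
        node, messages = self.branches[name], []
        while node is not None:
            messages.append(node.message)
            node = node.parent
        return messages


repo = ToyRepo()
repo.commit("Add level descriptions", {"levels.md": "# Level 1"})
repo.branch("spoilers")
repo.commit("Add level 1 solution", {"solutions.md": "# Level 1 solution"})
print(repo.log("spoilers"))
print(repo.log("no-spoilers"))
```

The `spoilers` history includes the base commit plus its own, while `no-spoilers` never sees the solution, which is the whole point of keeping them on separate branches.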
It can also be used to enable collaborative editing. Git can also run as a server -- enabling developers to copy (`git clone`) existing repositories hosted on it, submit (`git push`) their own, get updates for existing ones (`git pull`) and submit changes to existing ones. The typical flow is to clone the repo you want to make changes to, put those changes into their own branch, and then send what's called a "pull request" asking to take your new branch and add its changes to some existing one on the source repo.
Like many of the other examples, Git comes in a variety of frontends. There are more modern terminal options like `bit` and `lazygit`, as well as standalone GUI tools like Git Cola and Gitnuro. Code editors like VS Code, Pulsar, and others support Git out of the box. Git comes with its own Web interface, but there are a variety of others you can run (or join) in its place -- with many modern examples also containing various convenience functionality ranging from project management tooling to automation workflows. GitHub is an example of this, with some popular open-source examples including GitLab, Gitea, and Forgejo.
(No Boilerplate, linked at the top, has a whole entry on using these systems for organizational planning, focusing on GitHub in particular: The Unreasonable Effectiveness of Plain Text.)
Now, while you could use this for technical writeups, you could also just use it for books. If you, say, had a NaNoWriMo habit, you could use a system like this to maintain different versions of a chapter, different chapter orders, and the like using different branches, instead of the standard college term paper practice of having every draft as a file called something like `Chapter 3 draft 3a NOT FINAL MIX MOUNTAIN ROOT.docx`. (Flashing back to the time a Linux ISO and Google Docs -- which supports Markdown now! -- saved my ass after a hard drive crash.)
Speaking of books...
Hero or Menace? Exclusive Daily Bugle Photos... (Content publishing)
Not only can you output EPUB files directly using Pandoc, but site generators like mdBook will enable you to publish books as websites, by rendering your table of contents as pages you can navigate through for each chapter.
In addition to the variety of tools that can process and render Markdown content directly within the browser, there are static site generators across numerous ecosystems that can generate page content from Markdown, route them by filename, and manage content using Front Matter -- typically extending it with things like a `layout` property that you can use to manage theming by linking to page templates. (You're looking at one right now. This entire site is built using Lume, a site generator that runs on top of Deno, a JavaScript toolchain I particularly enjoy. I'll probably gush about that in its own piece eventually, but not this one.)
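As a rough illustration of what "route them by filename" can mean, here's a small Python sketch that maps a source path to a URL. The convention it uses (an `index.md` stands in for its folder, and every other file becomes its own trailing-slash route) is an assumption picked for the example, and `route_for` is a made-up name. Real generators each have their own rules and config for this.

```python
from pathlib import PurePosixPath


def route_for(source: str) -> str:
    """Map a Markdown source path to a URL route (assumed convention)."""
    path = PurePosixPath(source)
    parts = list(path.parent.parts)  # folders become URL segments
    if path.stem != "index":
        parts.append(path.stem)      # 'hello.md' -> 'hello'
    return "/" + "/".join(parts) + ("/" if parts else "")


for source in ["index.md", "about.md", "posts/index.md", "posts/markdown.md"]:
    print(source, "->", route_for(source))
```

The upshot is that the folder structure doubles as the site map, so adding a page is just adding a file.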
You, of course, can also use Markdown to write technical documentation. Aside from more general-purpose tools for managing knowledge, there are also site generators like MkDocs and Docusaurus that are built specifically for this purpose. Moreover, several programming languages come with support for this out of the box -- mdBook for instance is maintained by the Rust team, and is used to maintain first-party documentation for the language.
And did I mention you can build slide decks in Markdown? Instead of fighting with PowerPoint, you can use tools like Marp, Reveal.js, and Slidev to build presentations that use line separators to define slides. They come with their own themes out of the box, but you can of course also define your own using CSS. (Like I said -- this post could have been a presentation.)
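The "line separators define slides" idea is simple enough to sketch in a few lines of Python. This assumes a lone `---` on its own line marks a slide break, which is a common convention but not a universal one, and `split_slides` is a hypothetical helper rather than anything these tools expose. It also ignores the wrinkle that a deck may open with a `---`-fenced Front Matter block.

```python
import re


def split_slides(deck: str) -> list[str]:
    """Split a Markdown deck into slides on lines containing only '---'."""
    chunks = re.split(r"^---[ \t]*$", deck, flags=re.MULTILINE)
    return [chunk.strip() for chunk in chunks if chunk.strip()]


deck = "# One\n\n---\n\n# Two\n- a point\n\n---\n\n# Three"
for number, slide in enumerate(split_slides(deck), start=1):
    print(f"Slide {number}: {slide.splitlines()[0]}")
```

Each chunk is still plain Markdown, so the same content can be rendered as a page, a deck, or both.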
And all of this assumes you're not already just using one of the numerous blogging platforms on the Internet that natively support it like Medium or Notion or Ghost or WordPress or...
Anyone can wear the mask (fin)
There are new tools being built around Markdown all the time. The nature of its extensibility is that it's sprawling, so much like how I can't tell you about every text editor, I can't tell you every use for Markdown. Obsidian, for instance, recently shipped an open format for infinite canvas layouts that stores text content in Markdown. Hell, I haven't even used all of the ones I listed myself, because there are always more cropping up. And so I realize that I've maybe just given you the equivalent of a Netflix menu to scroll through for a couple of hours, but my bet (even if I never personally got around to Kakegurui) is that you're going to come across something binge-worthy -- whether you found something fun in this guide, you stumble across something else that uses Markdown later, or you even just find that you're already using it somewhere else on the Internet that you call home.