A beginner's guide to Markdown

2023-07-18

Markdown is something I use almost every day. I type notes in Markdown. I type blog posts in Markdown. Sometimes I dream in Markdown.

Given this, I wanted to write—in Markdown, of course—a "beginner's guide", explaining why it exists, how it works, and why you, too, could benefit from writing in Markdown.

Hopefully you find this interesting, even if you're already a Markdown aficionado like myself.

What is Markdown?

Simply put, Markdown is a file format that allows you to add formatting elements to a plain text document, which can then be converted to HTML (HyperText Markup Language). It was invented by John Gruber in 2004.

The formatting elements you can specify in Markdown include all the classic ways to format and organize text, most of which you're likely familiar with: bold, italics, lists, quotes, links, headings, horizontal rules, and more.

Basically, Markdown is just like a text file, but with some extra symbols in it that specify how the text should be formatted when read.

But wait, why do you need formatting? Why is text not enough on its own?

Text is great on its own, but sometimes you want text+ (that's also the name for text's latest streaming platform, although in my opinion the content is a bit wordy).

Think about when you talk: there's often a lot more being communicated than just words, such as intonation, emphasis, and gestures. Ideally you'd want to capture some of that in text, but without having to explicitly say, "I'm putting emphasis on that word". Hence, we format the text, for example by bolding certain words to convey emphasis.

Or take an online article, which is a prime use case for Markdown: some of the text of the article is a heading, some of it is a quote from another source, and some of it is linking to other websites.

The way you know that is through the formatting—visual styles applied to the text—such as font-size, font-weight, spacing, layout, etc. These give the reader context clues, structure, and an idea of what the author wants to emphasize. Importantly, they convey that meaning implicitly through presentation, giving the reader the important information without interrupting their flow.

Without formatting, you'd have one big wall of text, which no one wants to read—or write.

Markdown simply gives you a way to specify formatting as part of the text itself.

How to denote formatting elements with Markdown

In Markdown, the way you denote formatting elements like bold, italics, and headings is through the use of non-alphanumeric symbols that you find on your keyboard, such as "_", "*", ">", and "#".

The most basic element in Markdown is the paragraph, which is simply indicated by a blank line before and/or after some text. Pretty easy, right? Just write a few paragraphs with blank lines between them and you're already writing Markdown!
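To make the paragraph rule concrete, here is a minimal Python sketch of how a converter might turn blank-line-separated text into HTML paragraphs. The function name `paragraphs_to_html` is made up for illustration, and real Markdown parsers handle many more cases than this.

```python
import re
from html import escape


def paragraphs_to_html(text: str) -> str:
    """Wrap each blank-line-separated chunk of text in <p> tags (hypothetical helper)."""
    # A paragraph break is one or more blank (empty or whitespace-only) lines.
    chunks = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, single line breaks are folded into spaces.
    return "\n".join(
        "<p>" + escape(" ".join(chunk.split()), quote=False) + "</p>"
        for chunk in chunks
        if chunk.strip()
    )


sample = "First paragraph,\nstill the first.\n\nSecond paragraph."
print(paragraphs_to_html(sample))
```

Note that the single line break inside the first paragraph does not start a new paragraph; only the blank line does.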

For some, that might be all the formatting they need. But for the rest of us, there are a few more Markdown symbols to know. You can find all the basic Markdown syntax here, but I'll go through a few of the symbols below for demonstrative purposes:

  • Headings are denoted by the pound/hashtag symbol (#) before the heading text, with the number of hashtags corresponding to the "level" of the heading (1 is the most important, 6 is the least important).
  • Bold is indicated by two asterisks (**) before and after the text to be bolded.
  • Italic is indicated by underscores (_) or single asterisks (*) before and after the text to be italicized.
  • Blockquotes are denoted by a greater-than sign (>) before the quoted text.
  • Ordered lists are denoted by a number with a period next to it before each list item (e.g. "1."). Unordered lists are denoted by a dash (-) before each item.
  • Images are denoted by an exclamation point, then brackets containing the alt text (i.e. a short description of the image), then parentheses containing the URL of the image, or where the image is hosted, e.g. ![A picture of Brian](/images/Brian.png).
  • Links are denoted by brackets containing the text of the clickable link, then parentheses containing the URL of the link: [A link!](https://www.numberthink.com).
  • Code is denoted by enclosing the text in backticks (`), or three backticks for a code block.
  • A horizontal rule (i.e. a line across the page that indicates a division between the content above and below it) is indicated by three or more asterisks, dashes, or underscores on a line by themselves (___).
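As a rough illustration of how a few of the symbols above map to HTML, here is a small Python sketch built on regular expressions. The function names `inline_to_html` and `heading_to_html` are hypothetical, and the patterns are deliberately simplified: they ignore nesting, escaping, and most edge cases that a real Markdown parser has to deal with.

```python
import re


def inline_to_html(text: str) -> str:
    """Convert a few inline Markdown elements to HTML (simplified sketch)."""
    # Images come before links, because image syntax is link syntax with a leading "!".
    text = re.sub(r"!\[([^\]]*)\]\(([^)\s]+)\)", r'<img src="\2" alt="\1">', text)
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)
    # Bold (two asterisks) comes before italic (one asterisk or one underscore).
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    text = re.sub(r"\b_(.+?)_\b", r"<em>\1</em>", text)
    return text


def heading_to_html(line: str):
    """Turn '## Title' into '<h2>Title</h2>'; return None if the line is not a heading."""
    match = re.match(r"(#{1,6})\s+(.*)", line)
    if match is None:
        return None
    level = len(match.group(1))  # the number of "#" symbols is the heading level
    return f"<h{level}>{match.group(2).strip()}</h{level}>"


print(heading_to_html("## How to denote formatting"))
print(inline_to_html("**bold** and _italic_ and *also italic*"))
print(inline_to_html("[A link!](https://www.numberthink.com)"))
```

The order of the substitutions matters: handling bold before italic keeps the double asterisks from being read as two separate italic markers.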

This is just a good starting point. There could be more formatting elements, or slight variations in how they're denoted, depending on the "flavor" of Markdown being used.

What are the benefits to including the formatting explicitly in the text document itself?

Making something bold using asterisks is great and all, but why not just use a word processor to do the formatting instead of adding all these funky symbols to the text?

Word-processors, sometimes referred to as WYSIWYG (what you see is what you get) text editors, like Microsoft Word, are wonderful inventions. To add formatting to text in a word-processor, all you have to do is highlight the text and press some buttons on a toolbar (or use a keyboard shortcut) for the desired formatting to be applied.

This works great so long as you're saving or printing your document from Microsoft Word (or whatever other word-processor). But when you want to move that text document to a different platform, such as the web, it might be hard to bring all those formatting elements along with the text. (True, Microsoft Word has an "export as web page" feature, but that comes with limitations.)

The problem with a word-processor is that you don't actually see how the formatting of the text is being encoded, unlike Markdown, in which the formatting is explicitly denoted within the text. You can see the formatting when you're editing inside the word-processor, but when you save and close the file, how is the formatting being stored so that it's there again when you reopen it?

The truth is you don't know (unless you wrote the word-processing software yourself). The formatting is "hidden" behind the text, and stored as some inscrutable combination of ones and zeroes (like all code). When you want to reopen the document, only the word-processor's software knows how to convert those ones and zeroes into the intended text formatting again. Thus you're dependent on the word-processor's proprietary software to be able to see the formatting you added, and to have others see it. Sure, you could export it to another file format, like PDF, but then you can't edit your document anymore, or you become dependent on more proprietary software.

In short, a word-processor condemns you to relying on special software to be able to view and edit your document in perpetuity. While you might not have an issue viewing and editing your document now, there's always a chance you'll lose it, or have trouble opening it at some point in the future when you change or update your device. Things go wrong with software all the time, so the less software your work is dependent on, the better.

This is why the simplicity of Markdown can make text more portable and durable: it's just a plain text document, nothing else behind the scenes. All the information in the file is visible as text. Since it's just a text file—one of the most universal and widely used file formats—it's not vulnerable to becoming incompatible or obsolete in the future. There's no special software needed to re-open it, edit it, and retain all its formatting.

But what if Markdown goes out of style, and everyone forgets what the symbols mean? Then won't the text I wrote, and its formatting, become obsolete?

That's a good point, and a risk worth considering, but not all that worrisome in my opinion. Markdown has been around since 2004 and has only grown in popularity and usage.

This is a chart of Markdown search frequency over time. There's a clear trend up and to the right:

[Chart: Markdown search popularity over time]

There's also a large and growing list of applications that support Markdown, such as GitHub (the README.md file found in many repositories is a Markdown file), Reddit, or the developer blogging platform dev.to. You can find many more Markdown-friendly tools and applications here.

But even if Markdown goes obsolete—which, again, there is no reason to believe it will—what Markdown is converted to, HTML, will almost certainly not go obsolete. HTML is the fundamental building block of the World Wide Web, and the web isn't going anywhere anytime soon.

So you can rest assured your Markdown content is safe and sound, and can be read long after you and I are gone.

Why write text that is meant to be consumed as HTML?

Like I mentioned earlier, Markdown is meant to be converted to HTML when you're ready to publish your content.

Markdown is for authoring, not reading. Not everyone knows Markdown syntax, and it's easier to read text that is actually formatted, rather than plain text with a bunch of symbols in it.

This is really the whole point of Markdown. It's not just a shorthand for formatting text; it's a shorthand for formatting text that directly corresponds to HTML elements.

But why should HTML, of all formats, be the final, consumable version of your document?

HTML as the ultimate document format

To format text, you need a markup language, which is a text document with symbols in it that determine its structure and formatting.

In other words, you need more than just text: you need a structure around it that tells you (or, more accurately, machines) what that particular text is, and how to present it for an end-user. In practice, you probably also need something like a style sheet, which tells a browser how to format the specific elements in the markup document.

But what makes HTML the best markup language?

HTML is a special markup language because it's the original, and still most widely used, document format on the World Wide Web. This means billions of devices all over the world can parse and display HTML, making it easy to share your content with almost anyone in the world. HTML is also special because it's not just text, but hyper-text: text that can link directly to other HTML documents. This makes it more desirable to have your content in HTML because it can link to all the other HTML content in the world.

And because of network effects, HTML just keeps getting more entrenched as a document format. Much of humanity's information is already in HTML, so it makes sense to also host your content in HTML on the web, so that it can link to, and be in conversation with, all the other information. This causes yet more information to be encoded as HTML, making it an even more desirable document format, and so on.

There's still software needed to take an HTML document, along with any CSS, and render it as the pretty, formatted web page you see on screen. But there will always be a leap between the text document you create, and what an end-user sees on screen. If you want your written words on screen for others to read, you have to rely on some technology to make that happen.

HTML, along with CSS and other web technologies, is simply the best, most robust option for rendering text, because web browsers, the software that does the rendering, are ubiquitous, operating-system agnostic, well-documented, and have millions of developers contributing to and building on top of them.

All of this makes HTML the best document format for your published work if you want it to be widely read, accessible, robust to change, and interlinked with as much other content as possible.

Why not just write HTML?

So far, I've established some of the benefits of Markdown to be:

  • It puts the formatting of text explicitly in the text document, so there's no reliance on word-processing software.
  • It converts directly to HTML, which is the de facto document format for humanity's information.
  • The only software it does ultimately rely on, the web browser, is so widely used, and so well developed and documented, that browsers (and the HTML documents they render) will likely never go obsolete.

But why not just write HTML in the first place?

The answer is simply that HTML is tedious to write and edit.

Every element of an HTML document—which ranges from a whole paragraph, to a single italicized word, to non-prose elements like buttons and videos—has to be enclosed in brackets and an element "tag".

For a paragraph element, represented by the p tag, that looks like <p>Paragraph here.</p>. For a heading 1 element—the most important heading on a page—that looks like <h1>Heading here</h1>. Not only are these brackets and tags tedious to type out, but the document quickly becomes hard to read and edit when the text is cluttered with all those tags.
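To put a rough number on that clutter, the following Python sketch counts how many extra characters each notation adds on top of the plain text. The helper `markup_overhead` and the example strings are illustrative assumptions, not measurements of real documents.

```python
def markup_overhead(marked_up: str, plain: str) -> int:
    """Return how many extra characters the markup adds on top of the plain text."""
    return len(marked_up) - len(plain)


# Each entry: (plain text, Markdown version, HTML version).
examples = [
    ("Paragraph here.", "Paragraph here.", "<p>Paragraph here.</p>"),
    ("Heading here", "# Heading here", "<h1>Heading here</h1>"),
    ("bold", "**bold**", "<strong>bold</strong>"),
]

for plain, markdown, html in examples:
    print(
        f"{plain!r}: Markdown adds {markup_overhead(markdown, plain)} characters, "
        f"HTML adds {markup_overhead(html, plain)}"
    )
```

A handful of characters per element may not sound like much, but it adds up quickly over a document with many paragraphs, headings, and emphasized words.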

The HTML element tags are great for a machine to quickly understand and format a text document, but as a human, they're not all that pretty to look at or type out.

So instead of manually writing out those element tags, some clever developers made a collection of symbols that correspond directly to HTML element tags, but are much more concise and easy to type as you're writing. This is how Markdown was born.

Confusingly, this actually also makes Markdown a markup language. But unlike other markup languages, Markdown is purposefully lightweight and human-friendly. The symbols are short and sweet, and allow you to insert formatting into text without too many extra keystrokes, and without messing around in a toolbar that might disrupt your writing flow.

And once you memorize the basic Markdown syntax (which isn't too hard, since there are only 10 or so basic syntax elements), you won't need to interrupt your writing to look up a symbol. You can write completely formatted text in one fell swoop, no fancy software or interface needed: just plain ol' text.

Markdown, in other words, is human-readable (and writable) text that gets converted to machine-readable text.

Converting Markdown to HTML

This does mean, though, that there's one more software dependency when it comes to using Markdown: the code that converts Markdown to HTML.

Like any software, things can go wrong in this process, and there are always edge cases, bugs, and slightly different implementations. But because Markdown is so widely used, there are many well-documented Markdown parsers. Nowadays they work extremely fast, so you can write Markdown and see the HTML output instantly as you write.
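To give a feel for what such a converter does, here is a toy Python sketch that handles only a tiny subset of block-level Markdown: headings, horizontal rules, blockquotes, unordered lists, and paragraphs. The function name `markdown_to_html` is made up, and the approach (classifying each blank-line-separated block by its first characters) is an assumption chosen for brevity, not how any particular parser works. It also skips inline formatting entirely.

```python
import re
from html import escape


def markdown_to_html(source: str) -> str:
    """Convert a tiny subset of block-level Markdown to HTML (toy sketch)."""
    html_parts = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        first = lines[0]
        heading = re.match(r"(#{1,6})\s+(.*)", first)
        if heading and len(lines) == 1:
            # One to six "#" symbols give the heading level.
            level = len(heading.group(1))
            text = escape(heading.group(2), quote=False)
            html_parts.append(f"<h{level}>{text}</h{level}>")
        elif len(lines) == 1 and re.fullmatch(r"\*{3,}|-{3,}|_{3,}", first):
            # Three or more asterisks, dashes, or underscores on their own line.
            html_parts.append("<hr>")
        elif all(line.startswith(">") for line in lines):
            quote = " ".join(line[1:].strip() for line in lines)
            html_parts.append(
                f"<blockquote><p>{escape(quote, quote=False)}</p></blockquote>"
            )
        elif all(line.startswith("- ") for line in lines):
            items = "".join(
                f"<li>{escape(line[2:].strip(), quote=False)}</li>" for line in lines
            )
            html_parts.append(f"<ul>{items}</ul>")
        else:
            # Anything else is treated as a plain paragraph.
            paragraph = escape(" ".join(lines), quote=False)
            html_parts.append(f"<p>{paragraph}</p>")
    return "\n".join(html_parts)


document = "# Title\n\nSome text\nacross two lines.\n\n> A quote\n\n- one\n- two\n\n---"
print(markdown_to_html(document))
```

Even this toy version shows where the edge cases come from: a line of dashes could be a rule or the start of a list, which is why the order of the checks matters and why different parsers can disagree on unusual input.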

In a lot of cases where you might use Markdown, such as GitHub or Reddit, you don't even have to worry about converting it to HTML. The platform takes care of that for you.

Conclusion

In summary, Markdown is the fastest, most portable, and most universally recognized way to write formatted text.

When it's converted to HTML, you get text that is machine-readable, and when published on the web, interlinked with all of the other machine-readable text in the world. So you get encoded text, in the sense that it is outputted as code that machines—particularly web browsers—can display.

Since text is arguably the closest thing to human thought, Markdown is really the best and fastest way to encode thought. At least I think. Luckily, I have Markdown, so I can quickly write down thoughts like these and share them with others before they dissolve into the mists of time.

Anyways, Markdown's probably not going anywhere for a while, so you can safely put your best work in Markdown. It will still be understood, shared, and machine-readable for many years to come—for better or worse.