prd.md - macrodown

You are about to implement a markdown processor in C++ called
“MacroDown”. Besides markdown, its main feature is support for
TeX-like macros. The processor will output standard HTML.

The processor will implement the CommonMark syntax. Specifically, you
should implement the parsing strategy outlined in
https://spec.commonmark.org/0.31.2/#appendix-a-parsing-strategy.
However, with this processor, you should convert all the syntax
elements (emphasized text, bold text, quotes, code, etc.) into macros,
and construct a syntax tree. As the last step, evaluate the macros into
HTML. A macro is basically a function that can have arguments, and can
expand into text and more macros. For example, I can define a macro

```
%def[my_macro]{t1, t2}{It’s a %em{%t1} that is %t2.}
```

This code should define a macro called `my_macro`. It takes two
arguments, `t1` and `t2`, and it expands the body with these arguments.
If I write in my document `%my_macro{test}{good}`, it should expand
into `It’s a %em{test} that is good.`, in which there’s another macro,
`em`, which will in turn expand into something else. Of course `def`
is a special case that is used to define macros.

In short, the syntax to define a macro should be
```
%def[name]{arg1, arg2, ...}{body}
```
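
To make the definition syntax concrete, here is a minimal sketch of
how a `%def` form could be split into its parts. The names `MacroDef`,
`readGroup`, `splitParams` and `parseDef` are hypothetical, and the
sketch assumes braces in the body are balanced and never escaped.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record for one macro definition.
struct MacroDef
{
    std::string name;
    std::vector<std::string> params;
    std::string body;
};

// Read a group that opens at src[pos], honoring nested groups. On success,
// return the text between the delimiters and move pos past the closer.
std::optional<std::string> readGroup(const std::string& src, std::size_t& pos,
                                     char open, char close)
{
    if(pos >= src.size() || src[pos] != open) return std::nullopt;
    int depth = 0;
    for(std::size_t i = pos; i < src.size(); ++i)
    {
        if(src[i] == open)
        {
            ++depth;
        }
        else if(src[i] == close && --depth == 0)
        {
            std::string content = src.substr(pos + 1, i - pos - 1);
            pos = i + 1;
            return content;
        }
    }
    return std::nullopt;  // The group was never closed.
}

// Split "arg1, arg2" on commas and trim spaces and tabs around each name.
std::vector<std::string> splitParams(const std::string& list)
{
    std::vector<std::string> params;
    std::string current;
    auto flush = [&]() {
        const auto first = current.find_first_not_of(" \t");
        if(first != std::string::npos)
        {
            const auto last = current.find_last_not_of(" \t");
            params.push_back(current.substr(first, last - first + 1));
        }
        current.clear();
    };
    for(char c : list)
    {
        if(c == ',') flush();
        else current += c;
    }
    flush();
    return params;
}

// Parse "%def[name]{arg1, arg2, ...}{body}" found at the start of src.
std::optional<MacroDef> parseDef(const std::string& src)
{
    const std::string keyword = "%def";
    if(src.compare(0, keyword.size(), keyword) != 0) return std::nullopt;
    std::size_t pos = keyword.size();
    const auto name = readGroup(src, pos, '[', ']');
    const auto params = readGroup(src, pos, '{', '}');
    const auto body = readGroup(src, pos, '{', '}');
    if(!name || !params || !body) return std::nullopt;
    return MacroDef{*name, splitParams(*params), *body};
}
```

Counting brace depth is what lets the body contain nested calls such
as `%em{%t1}` without ending the definition early.
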
The syntax to call a macro should be
```
%name{arg1}{arg2}...
```
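
As an illustration of what a call does, the following sketch performs
one expansion step: it substitutes the arguments into the body and
leaves any other macro (like `%em`) untouched for a later pass. The
names `MacroDef`, `isNameChar` and `expandOnce` are hypothetical, and
the sketch assumes macro and parameter names consist of ASCII letters,
digits and `_`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one macro definition.
struct MacroDef
{
    std::string name;
    std::vector<std::string> params;  // e.g. {"t1", "t2"}
    std::string body;                 // e.g. "It's a %em{%t1} that is %t2."
};

// Assumption: names are made of ASCII letters, digits and '_'.
bool isNameChar(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9') || c == '_';
}

// Replace each %param in the body with the matching argument. Any other
// %name is copied unchanged so that it can be expanded in a later step.
std::string expandOnce(const MacroDef& def,
                       const std::vector<std::string>& args)
{
    std::string out;
    std::size_t i = 0;
    while(i < def.body.size())
    {
        if(def.body[i] != '%')
        {
            out += def.body[i++];
            continue;
        }
        // Collect the name that follows the percent sign.
        std::size_t end = i + 1;
        while(end < def.body.size() && isNameChar(def.body[end])) ++end;
        const std::string name = def.body.substr(i + 1, end - i - 1);
        bool replaced = false;
        for(std::size_t k = 0; k < def.params.size() && k < args.size(); ++k)
        {
            if(def.params[k] == name)
            {
                out += args[k];
                replaced = true;
                break;
            }
        }
        if(!replaced) out += def.body.substr(i, end - i);
        i = end;
    }
    return out;
}
```

With the `my_macro` definition from earlier and the arguments `test`
and `good`, `expandOnce` returns `It's a %em{test} that is good.`.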

In order for a CommonMark document to parse into HTML in this way, we
will need to have a “standard library” of macros that implements the
basic markdown constructs like headings, emphasis, etc.

Note that a MacroDown document will mostly look like a normal
CommonMark file, but a writer could also write macro definitions and
calls in it.

Naming style:

* `CapCase` for classes and types
* `snake_case` for variables
* `UPPER_CASE` for global constants
* `camelCase` for functions. If the function name is just one word,
  use lower case.
* All file names should be `snake_case`.

Unicode should be handled properly.

Use CMake as the build system. Example CMake file:
https://github.com/MetroWind/planck-blog/blob/master/CMakeLists.txt.

The syntax tree should have a function to iterate over all nodes.
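
One possible shape for this, as a sketch only: a node is either plain
text or a macro call, and a `walk` function visits the node and all of
its descendants in pre-order. The `Node` layout and the name `walk`
are assumptions, not a required design.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type: plain text, or a macro call with child nodes.
struct Node
{
    std::string macro_name;  // Empty for a plain text node.
    std::string text;        // Used only by text nodes.
    std::vector<std::unique_ptr<Node>> children;

    // Visit this node first, then every descendant, depth first.
    void walk(const std::function<void(const Node&)>& visit) const
    {
        visit(*this);
        for(const auto& child : children)
        {
            child->walk(visit);
        }
    }
};
```

A caller can then collect, say, every `tag` macro in a document by
passing a lambda that checks `macro_name`.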

The library should expose an interface that allows the user to render
a document in two steps: the first step exposes the syntax tree (the
root node), and the second step renders the syntax tree into HTML.
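
The two steps could look roughly like the sketch below. Everything in
it is an assumption made for illustration: `parse` is only a stand-in
that wraps the whole input in a single `p` macro (the real one would
run the CommonMark parsing strategy), and `render` uses a small tag
table in place of the macro standard library.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: plain text, or a macro call with child nodes.
struct Node
{
    std::string macro_name;  // Empty for a plain text node.
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Stand-in for the standard library: macro name to HTML tag.
const std::map<std::string, std::string> TAGS = {
    {"p", "p"}, {"em", "em"}, {"strong", "strong"}};

// Step 1 (placeholder): return the root node of the syntax tree.
std::unique_ptr<Node> parse(const std::string& source)
{
    auto root = std::make_unique<Node>();
    root->macro_name = "p";
    auto text = std::make_unique<Node>();
    text->text = source;
    root->children.push_back(std::move(text));
    return root;
}

// Escape the characters that are special in HTML text.
std::string escapeHtml(const std::string& s)
{
    std::string out;
    for(char c : s)
    {
        switch(c)
        {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;
        default: out += c;
        }
    }
    return out;
}

// Step 2: evaluate the tree into HTML, children first.
std::string render(const Node& node)
{
    if(node.macro_name.empty()) return escapeHtml(node.text);
    std::string inner;
    for(const auto& child : node.children) inner += render(*child);
    const auto it = TAGS.find(node.macro_name);
    if(it == TAGS.end()) return inner;  // Unknown macro: emit children only.
    return "<" + it->second + ">" + inner + "</" + it->second + ">";
}
```

Keeping the tree in the caller's hands between the two calls is what
lets the user inspect or rewrite nodes before any HTML is produced.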

## Custom markups

An interface for the user to define custom markup should be exposed.
The user will be able to define two kinds of markup:

1. A markup starts with a prefix. For example, the user can define `#`
   as a prefix, then in `It’s a #test.`, `#test` is the part that’s
   being marked up. In other words, for a prefix markup, the text
   that’s marked up begins with the prefix, and ends before any
   whitespace or punctuation (excluding the underscore `_`, dash `-`,
   at-sign `@` and dot `.`).
2. A delimited markup. This kind of custom markup starts with a
   character and ends with the same delimiting character. No
   whitespace or punctuation is allowed in between, except for the
   underscore `_` and the dash `-`. If whitespace or punctuation is
   found between the delimiting characters, it’s not a markup. For
   example, suppose the user defines `:` to be a delimiter, then in
   `it’s a :test:.`, `:test:` is the part that’s being marked up;
   however in `it’s a :test.:`, nothing is marked up.

In both cases, the special character is a single Unicode character,
which could be multi-byte. The markup is defined with a function, and
is converted into a macro with one argument, the argument being the
text being marked up, without the special character. The first kind is
defined with `definePrefixMarkup(const std::string& prefix, const
std::string& macro_name)`. The second kind is defined with
`defineDelimitedMarkup(const std::string& delimiter, const
std::string& macro_name)`. In the example with `#` defining a prefix
markup, the defining call would be

    definePrefixMarkup("#", "tag");

which would transform `#test` into a macro `%tag{test}`.
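
To show how the two rules could be applied to a run of text, here is a
rough sketch. The helpers `isBreak`, `applyPrefixMarkup` and
`applyDelimitedMarkup` are hypothetical and separate from the
`definePrefixMarkup` / `defineDelimitedMarkup` interface. The sketch
makes two assumptions: only ASCII whitespace and punctuation are
classified (proper Unicode classification would need a character
database), and trailing dots are dropped from a prefix markup so that
`#test.` marks up `test`, as in the example above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumption: only ASCII whitespace and punctuation are classified. Bytes of
// multi-byte UTF-8 sequences are treated as word characters.
bool isBreak(unsigned char c)
{
    return c < 0x80 && (std::isspace(c) || std::ispunct(c));
}

// Hypothetical helper: rewrite `<prefix>word` into `%macro_name{word}`.
std::string applyPrefixMarkup(const std::string& text,
                              const std::string& prefix,
                              const std::string& macro_name)
{
    if(prefix.empty()) return text;
    std::string out;
    std::size_t i = 0;
    while(i < text.size())
    {
        if(text.compare(i, prefix.size(), prefix) != 0)
        {
            out += text[i++];
            continue;
        }
        const std::size_t start = i + prefix.size();
        std::size_t end = start;
        while(end < text.size())
        {
            const unsigned char c = static_cast<unsigned char>(text[end]);
            const bool kept = c == '_' || c == '-' || c == '@' || c == '.';
            if(isBreak(c) && !kept) break;
            ++end;
        }
        // Assumption: trailing dots are sentence punctuation, not word text.
        while(end > start && text[end - 1] == '.') --end;
        if(end == start) out += prefix;  // No word follows: keep it literal.
        else out += "%" + macro_name + "{" + text.substr(start, end - start) + "}";
        i = end;
    }
    return out;
}

// Hypothetical helper: rewrite `<d>word<d>` into `%macro_name{word}`.
std::string applyDelimitedMarkup(const std::string& text,
                                 const std::string& delimiter,
                                 const std::string& macro_name)
{
    if(delimiter.empty()) return text;
    std::string out;
    std::size_t i = 0;
    while(i < text.size())
    {
        if(text.compare(i, delimiter.size(), delimiter) != 0)
        {
            out += text[i++];
            continue;
        }
        const std::size_t start = i + delimiter.size();
        std::size_t end = start;
        // Scan until the closing delimiter or a disallowed character.
        while(end < text.size() &&
              text.compare(end, delimiter.size(), delimiter) != 0)
        {
            const unsigned char c = static_cast<unsigned char>(text[end]);
            if(isBreak(c) && c != '_' && c != '-') break;
            ++end;
        }
        const bool closed = end > start && end < text.size() &&
            text.compare(end, delimiter.size(), delimiter) == 0;
        if(closed)
        {
            out += "%" + macro_name + "{" + text.substr(start, end - start) + "}";
            i = end + delimiter.size();
        }
        else
        {
            out += delimiter;  // Not a markup: keep the delimiter literal.
            i = start;
        }
    }
    return out;
}
```

Because the comparison is done on whole byte sequences, a multi-byte
prefix or delimiter works the same way as `#` or `:`. A prefix that is
followed directly by whitespace, as in a CommonMark heading `# Title`,
is left alone.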