(2017), No. 2 Using Markdown Inside TEX Documents Vít Novotný

(2017), No. 2 Using Markdown Inside TEX Documents Vít Novotný

214 TUGboat, Volume 38 (2017), No. 2 Using Markdown inside TEX documents Vít Novotný Abstract Markdown is a lightweight markup language that makes it easy to write structurally simple documents using a clean and straightforward syntax. Although various tools for rendering Markdown documents via TEX exist, they tend to be built on top of TEX rather than in TEX. This paper briefly presents existing tools and introduces a macro package for plain-based TEX for- mats that takes a different approach. By making it possible to put snippets of Markdown-formatted text into arbitrary TEX documents and exposing TEX macros that control the rendering of Markdown elements, the package provides a convenient way of introducing Markdown into existing TEX workflows. 1 Introduction TEX is a fine tool for typesetting many kinds of documents. It may, however, not always be the best language for writing them. Markup languages based on SGML and XML make it possible for an author to focus on the content of their documents without having to worry about the error messages produced by commands and unforeseen macro pack- age interactions. The resulting documents can then be transformed not only to various TEX formats, but also to other output formats that bear no relation to the TEX world. When preparing structurally simple documents, however, SGML and XML with their bulky syntax may feel too heavy-handed. For these kinds of docu- ments, lightweight markup languages that exchange raw expressive power for clean and simple syntax are often the best choice. In this paper, I will focus on the lightweight markup language of Markdown [2]. Although the language of Markdown was origi- nally envisioned as an HTML preprocessor, its syntax is agnostic to the output format, which makes Mark- down useful as a general document markup language. Tools that provide conversion from Markdown to various TEX formats are therefore readily available. One of the better-known free open-source programs that enable conversion from Markdown to TEX is Pandoc [4]. Dubbed a Swiss Army Knife by its author, Pandoc enables the conversion between an impressive number of markup languages (e.g. LATEX, ConTEXt, HTML, XML Docbook) and output for- mats (e.g. ODF, OOXML, or PDF). The tool has already been reviewed in TUGboat by Massimiliano Dominici [1]. Vít Novotný TUGboat, Volume 38 (2017), No. 2 215 Pandoc is a powerful multi-target publishing As a result, users are limited in their ability to use software and its ability to perform lossy conversions TEX inside their Markdown documents, but there is (such as from LATEX to HTML) makes it extremely still plenty of rope left for halts, crashes, and external useful for document manipulation in general. How- command execution. ever, if our sole goal is to use Markdown markup Another inconvenience of Pandoc is its lack of inside TEX documents, Pandoc displays several weak- integration with the TEX distributions. TEX docu- nesses. ments without external dependencies and written The ability to redefine the correspondence be- in stable formats such as plain TEX require virtu- tween Markdown elements in the input and TEX ally no maintenance. The use of external assets and macros in the output is limited. Processing the fol- actively developed formats such as ConTEXt will re- lowing Markdown document: quire some attention each time there is a major TEX - Single underlines/asterisks denote _emphasis_. distribution release. Software outside TEX distribu- - Double them for **strong emphasis**. tions such as Pandoc throws more variables into the - The *two* __may be__ _freely *mixed*_. mix, since different versions may produce different in Pandoc v1.17.2 using the command pandoc -f output even if the TEX distribution stays the same. Besides the trickier maintenance, Pandoc’s absence markdown -t latex produces the following LATEX document: from TEX distributions also means that it is unavail- able in major TEX services such as the collaborative \begin{itemize}\tightlist text editors at http://www.sharelatex.com/ and \item Single underlines/asterisks denote http://www.overleaf.com/. \emph{emphasis}. \item Double them for \textbf{strong emphasis}. Other major tools for rendering Markdown, such \item The \emph{two} \textbf{may be} as MultiMarkdown, were reviewed and found to be \emph{freely \emph{mixed}}. plagued by similar design choices. With this knowl- \end{itemize} edge, I decided to prepare a macro package for render- ing Markdown inside T X that would take a different While it is possible to redefine the produced LATEX E macros in theory, altering base macros such as \item approach. The goals were: or \textbf may break the document in subtle ways. • to make it easy to specify how individual Mark- The output is also not fixed and may vary between down elements should be rendered, different versions of Pandoc. • to provide sandboxing capabilities, and Another desirable feature is sandboxing. Mark- • to make sure that the package required nothing down is a static markup language without program- more than what was present in standard TEX ming capabilities and may be used by ordinary users distributions. without much training. If Markdown documents 2 The markdown.tex package are submitted to a system and then automatically typeset using TEX, then these documents should cer- 2.1 Architectural overview tainly not be able to crash or halt the compilation, The block diagram in figure 1 summarizes the high- or to execute external programs using the \write18 level structure of the package. Working from the command and similar mechanisms. Pandoc does not bottom of the diagram upwards, I will now describe offer much in these regards, since it permits TEX the individual layers. macros in the input Markdown documents. There The translation from Markdown to TEX is done exist complex rules for deciding whether or not an by the Lunamark Lua library [3]. The library was occurrence of a TEX special character should be kept modified, so that it would not depend on external or removed; the following document: libraries and so that it would produce an intermediate This {will} 2^n \begin{get} r~moved and \this plain TEX representation of the input Markdown {won’t} \begin{equation}2^n\end{equation}$2^n$. document. This means that instead of Pandoc’s TEX when converted with Pandoc becomes: representation (repeated here from the previous page for ease of reference), This \{will\} 2\^{}n \textbackslash{}begin\{% get\}r\textasciitilde{}moved and \this{won’t} \begin{itemize}\tightlist \begin{equation}2^n\end{equation} \(2^n\). \item Single underlines/asterisks denote \emph{emphasis}. Apparently, the aim is to enable the use of mathemat- \item Double them for \textbf{strong emphasis}. ics and simple TEX macros while retaining baseline \item The \emph{two} \textbf{may be} compatibility with standard Markdown documents \emph{freely \emph{mixed}}. that may contain portions resembling plain TEX. \end{itemize} Using Markdown inside TEX documents 216 TUGboat, Volume 38 (2017), No. 2 2.2 Usage examples User code This section contains examples for markdown.tex (ver- sion 2.5.3). These should give an idea of the capabil- ities of the package. The examples are in LATEX for ConTEXt layer LATEX layer ease of exposition. As noted in the previous section, the LATEX layer of the package is reasonably thin and the examples can therefore be easily adapted for the plain TEX and ConTEXt layers. For brevity, Plain TEX layer the examples will contain only the body of a LATEX document assuming the following preamble: \documentclass{article} Lua layer \usepackage{markdown,graphicx} To typeset the examples, you can use the lualatex or Figure 1: A block diagram of the package pdflatex -shell-escape commands. For further information, see the package documentation [5]. the library will produce the following representation: Sandboxing disables hybrid markup and is en- abled by default. As a result, the following example: \markdownRendererUlBeginTight \markdownRendererUlItem \begin{markdown} Single underlines/asterisks denote $\sqrt{x^2 + y^2}$ \markdownRendererEmphasis{emphasis}. \end{markdown} \markdownRendererUlItemEnd will produce $\sqrt{x^2 + y^2}$. To enable the use \markdownRendererUlItem of hybrid markup, the hybrid option needs to be Double them for specified. The following example: \markdownRendererStrongEmphasis{strong emphasis}. \begin{markdown*}{hybrid} \markdownRendererUlItemEnd $\sqrt{x^2 + y^2}$ \markdownRendererUlItem \end{markdown*} The \markdownRendererEmphasis{two} will produce px2 + y2 as expected. You may also \markdownRendererStrongEmphasis{may be} create a partial sandbox; the following example en- \markdownRendererEmphasis{freely ables the use of non-breaking spaces: \markdownRendererEmphasis{mixed}}. \begin{markdown*}{renderers={tilde=~}} \markdownRendererUlItemEnd Bartel~Leendert van~der~Waerden \markdownRendererUlEndTight \end{markdown*} This representation has two useful properties: it With hybrid markup, using underscores and works with any T X format that uses the same spe- E backticks may produce unexpected results. That is cials as plain T X does and it represents the logical E because Markdown uses underscores for emphasis structure of the input Markdown document in terms and backticks for inline verbatim text, whereas LAT X of macros that can be freely redefined by the user. E uses underscores for math subscripts and backticks The plain T X layer exposes the capabilities E for opening quotation marks. Preceding characters of the Lua library as T X macros and provides de- E with a backslash disables their special meaning in fault definitions for the \markdownRendererhNamei Markdown, as the following example shows: macros from the intermediary representation. Macro package developers are encouraged to redefine the \begin{markdown*}{hybrid} \markdownRendererhNameiPrototype macros that \‘\‘This is a quote with $a\_{subscript}$.’’ \end{markdown*} correspond to the default definitions. When the LuaTEX engine is used, the Lua library is accessed di- This, however, makes the text difficult to both read rectly. Otherwise, the shell escape (\write18) mech- and write. Alternatively, you can disable backticks anism is used (see [5, sec.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    4 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us