Some basics of

Introduction

Reading the stories posted on Chakatheaven and Furaffinity, I’ve repeatedly come across well-written ones where the presentation is a bit of a deterrent. Those gems include body text set in 16 pt, long texts set entirely in italics and centring with tabs. Headings are a bit of a sore spot, too.

In the case of plain text there are not many formatting options, but with more advanced formats those options exist. As a consequence I’ve decided to write this mini-manual in order to improve the general readability of texts, followed by an overview of the software available to produce readable texts. I claim neither perfection nor completeness for this overview, but am reasonably certain to get at least the basics right. Of course, with good reason, deviating from these guidelines is entirely acceptable.

Formatting

Text Alignment

Fundamentally, there are four different alignments: justified (most of this document), flush left, centred and flush right. For longer texts in languages written from left to right, flush left or justified are the only really worthwhile options. The difference between justified and flush left is the treatment of the text on the right side: flush left keeps distances within the text (space widths, distance between letters) constant and produces a ragged right edge, while justification varies those (and, if activated, hyphenates words) to keep both sides aligned.
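The difference can be sketched in a few lines of Python for monospaced plain text. This is only an illustration under simplifying assumptions: it varies space widths only (no letter spacing, no hyphenation), and the function names `flush_left` and `justify` are made up for this example.

```python
import textwrap


def flush_left(text: str, width: int) -> list[str]:
    """Wrap with constant single spaces; the right edge stays ragged."""
    return textwrap.wrap(text, width=width)


def justify(text: str, width: int) -> list[str]:
    """Wrap, then widen the gaps so every line but the last fills `width`."""
    lines = textwrap.wrap(text, width=width)
    justified = []
    for line in lines[:-1]:
        words = line.split()
        if len(words) == 1:
            justified.append(line)  # a single word cannot be stretched
            continue
        gaps = len(words) - 1
        spaces = width - sum(len(word) for word in words)
        base, extra = divmod(spaces, gaps)
        # The first `extra` gaps receive one additional space each.
        parts = [
            word + " " * (base + (1 if index < extra else 0))
            for index, word in enumerate(words[:-1])
        ]
        parts.append(words[-1])
        justified.append("".join(parts))
    justified.extend(lines[-1:])  # the last line stays flush left
    return justified


if __name__ == "__main__":
    sample = (
        "Flush left keeps distances within the text constant and produces "
        "a ragged right edge while justification varies those to keep both "
        "sides aligned."
    )
    print("\n".join(flush_left(sample, 30)))
    print()
    print("\n".join(justify(sample, 30)))
```

Word processors do the same job with proportional fonts and much finer adjustments, but the principle of distributing the leftover space across the gaps is the same.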

Font/Typeface

Traditionally, a typeface is what’s today usually also referred to as a font: a family of different but related fonts, usually bundled together. For example, Regular, Italic, Bold and Bold Italic are different fonts belonging to the same typeface, as is, strictly speaking, each individual font size and weight. Better typefaces have individual font definitions for the main varieties, while weights beyond Regular and Bold are a feature found mostly in truly professional typefaces.

Generally, there are three really relevant families of typefaces to distinguish. Serif typefaces have small lines at the end of each letter’s strokes, guiding the reader along the line while reading. They are often used in books and newspapers, where the reader faces a lot of text. Their problem is that they often don’t reproduce too well when a small font size meets a low resolution. This combination can result in the optical disappearance of delicate parts of the letters like thin lines (or the serifs themselves) when the font is too light for its size. Sans serif typefaces usually do not suffer from this problem: they have no serifs and usually a relatively homogeneous line width. This makes them more legible, but longer texts in particular become somewhat less readable. They are mostly used for signs, headings and electronic texts (especially online), where a low-resolution output medium is the norm rather than the annoying exception. Monospaced fonts are a bit of the odd one out. They are not defined by serifs or their lack but rather by the uniform width of each character (glyph). The other families (blackletter, symbol, effect etc.) are suitable for special effects but usually a bad choice for longer texts.

The entire document can be set in a single typeface or in a mixture, as done here: sans serif for headings and a serif font for the text. The main text should be set in a readable but not too big font size. The usual range is between 9 and 14 pt; this text is set in 12 pt.

Important side note: Typefaces set in the same nominal size can look quite different. For comparison (all in 12 pt): Times New Roman Georgia New.

Above all: Don’t get too fancy. As noted above, weird-looking typefaces are OK for singular effects or headings, but for the bulk text use practical, readable ones like Times (New Roman), Georgia, Libertine, Charis SIL (serif), Arial, Calibri, PT Sans (sans serif) or the Liberation fonts (entire family including serif, sans serif and monospaced). They might look boring, but if the reader decides not to bother with reading more than one page of blackletter or oddities like Exocet or San Francisco, any prospective effect is somewhat moot.

Another complication is added by the availability of those typefaces on the reader’s system, so you shouldn’t rely too much on specific features of an uncommon font unless you bundle it with the document – however, this can lead to license problems if the typeface is not freely distributable.

If you release in an editable format, the reader can of course change the entire formatting to suit his preferences.

Emphasis

For emphasis, words or short passages can be set in italics (weak emphasis), bold (strong emphasis) or both. SMALL CAPS and ALL CAPS are possible too, but are primarily used in titles and headings rather than bulk text. Underlining should be avoided. Using a different typeface, a differently sized font, highlighting or colour is possible but problematic: too many fonts can give a distracting, incoherent result, and depending on the printer or display used (b/w laser, eInk), colours can be lost. In general, don’t overdo it with both amount and variation or you end up with a mess like this paragraph.

Headings

At first glance, creating a heading seems easy: increase the font size, make it bold and maybe even change the typeface.

Of course, knowing a few tricks can make things a lot easier in the long run, especially for longer texts. If you want to write a text properly structured by headings and subheadings, probably the most important thing to know is how not to do it. Do not simply assign the headings a different formatting manually, but use the template’s heading styles, which do this automatically. This seems a little pointless at first, but once you want to change the headings’ format, changing all of them individually is a major nuisance and inconsistencies are a constant risk. Using the styles also informs the program that those lines are headings in the first place, enabling nifty features like internal linking and automatically generated tables of contents. In most word processors, this feature is located in a dropdown list near the ones for font size and typeface.

New Paragraphs vs. Line Breaks

The two are often used without distinction (or line breaks are not used at all), but as can be seen in the following example, there is a difference. Typically, the first line of a new paragraph is indented or the vertical spacing between paragraphs is greater than between regular lines.

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim.

Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus. Phasellus viverra nulla ut metus varius laoreet. Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue. Curabitur ullamcorper ultricies nisi. Nam eget dui. Etiam rhoncus. Maecenas tempus, tellus eget condimentum rhoncus, sem quam semper libero, sit amet adipiscing sem neque sed ipsum. Nam quam nunc, blandit vel, luctus pulvinar, hendrerit id, lorem.

Just in case you want to know, the nonsense above is a particular and common placeholder text known as lorem ipsum, often used for layout examples.

Orphans and Widows

A first or last line of a longer paragraph standing alone at the bottom or top of a page is referred to as an orphan or widow. Both are generally considered layout errors and should be avoided; most word processors or layout software can prevent them automatically if the option is activated. Do it.

While you’re at it, make sure that headings are not separated from the following paragraph. This is apparently not set in every template although it should be in my opinion.

Ways of Writing Text

Plaintext

The absolutely minimal solution :-). Single font, single size, the formatting possibilities of a typewriter, but still with potential for problems from different character encodings and different line-break conventions. Advantages are minimal overhead and the fact that you effectively read and write the file directly, whether it’s program code or a layout description (HTML, LaTeX, Markdown). Also, even with incompatibilities, the text is usually still comprehensible (ASCII characters are normally the same). Using an editor capable of dealing with different standards, like Notepad++, PSPad or jEdit, is recommended.
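To illustrate the line-break problem: Windows traditionally ends lines with CR LF, Unix-like systems with LF and classic Mac OS with CR. The following Python sketch counts and unifies them; the function names `detect_newlines` and `normalize_newlines` are invented for this example, and a capable editor does the same at the click of a menu entry.

```python
def detect_newlines(raw: bytes) -> dict[str, int]:
    """Count how often each line-break convention occurs in raw file bytes."""
    crlf = raw.count(b"\r\n")
    return {
        "CRLF": crlf,                     # Windows
        "CR": raw.count(b"\r") - crlf,    # classic Mac OS (lone CR)
        "LF": raw.count(b"\n") - crlf,    # Unix, Linux, modern macOS (lone LF)
    }


def normalize_newlines(text: str, newline: str = "\n") -> str:
    """Rewrite all line breaks in `text` to a single convention."""
    # CRLF must be handled first, otherwise each one would become two breaks.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified if newline == "\n" else unified.replace("\n", newline)


if __name__ == "__main__":
    mixed = "first\r\nsecond\rthird\nfourth"
    print(detect_newlines(mixed.encode("ascii")))
    print(repr(normalize_newlines(mixed)))
```

A file that reports more than one convention has most likely passed through several systems or editors on its way.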

Word Processors

Examples are MS Word, OpenOffice/LibreOffice Writer, AbiWord, Office/WordPerfect Office, Kingsoft Office, SoftMaker Office, SSuite Office, Suite, Papyrus Autor/Papyrus Author and (possibly many) more.

General

In general, these are WYSIWYG[1] editors, which allow creating a layout with a minimum of effort.

• Relatively easy to use (WYSIWYG)
• Built-in assistance for several tasks
• Often a fairly wide choice of input and output formats
• Spell checking and (sometimes) grammar checking
• Invisible formatting can break the document internally; this can be tricky to fix.
• Proprietary formats make cooperation difficult, compatibility can be a problem, vendor lock-in

Microsoft Office

Probably the most widespread commercial office suite, available for Windows and MacOS.

• Tons of features
± Ribbon interface (new look and feel grouping options into different tabs rather than traditional menus and toolbars, since Office 2007)
• Costs money

OpenOffice/LibreOffice

Possibly the best-known free office suite(s), originally based on StarOffice.

• Free
• Many features
• Many different formats supported

SoftMaker Office

A less well-known proprietary office suite with good compatibility to . The next version for Windows is currently in the public beta phase.

• Good online support (including a public forum)
• Available for multiple platforms (Windows, Linux, Android) with little difference between the versions
• Installer is only 110 MB, and not much more disk space is needed after installation (SMO Standard 2012)
• Competitive price, especially during special sales

SoftMaker Free Office

Office suite based on SoftMaker Office but with slightly reduced features, available for Windows and Linux. Free of charge for both private and commercial use, but registration is required.

SSuite Office

A full office suite which also contains the relatively author-specific tools QT Writer Express, a very compact and Writer’s D’Lite, a tool focused on writing a text. The suite’s word processor is available separately, as well.

• Free (of charge)
• Compact. The bundle with the most tools included (Excalibur Suite) weighs only 42 MB, Wordgraph alone only 14.7 MB.
• Native Windows software, but can be used on Linux (WINE) and MacOS (CrossOver), too.
• Unlike many other word processors, the choice of input and output file formats is rather limited; as a result, document exchange is more or less limited to rtf and doc.

[1] What You See Is What You Get

Markdown

A markup language with a highly reduced feature set, meant to be directly human-readable at all times as well as easily convertible to (X)HTML. Unfortunately the original feature set was a little too reduced, and a lack of response from the original developer resulted in different (and not necessarily compatible) extensions like Markdown Extra, MultiMarkdown, Maruku or Pandoc’s Markdown.

LaTeX

• Free
• Extremely powerful
• Bad layout is difficult to achieve
• Editing possible with any plaintext editor
• Complicated, with a steep learning curve
• Needs a template. Making one is difficult; adapting an existing one is easier but limits design possibilities.
• Not that many people use it.

Exchange Formats

Further Editing

If you send or publish a file with the idea of further modification, obviously an editable format is required and compatibility can be a problem. If everyone involved uses the same software, this software’s default format is the obvious choice, but different platforms can introduce problems ranging from changed pagination to damaged files.

HTML is platform-independent and can be edited with a plaintext editor. It is, however, not suitable for a really refined layout, and code exported from word processors can contain a lot of data junk, making it much less directly editable by humans.

Markdown is easier to edit than raw HTML but at the core allows only a very limited set of formatting options, while the extensions… just have a look at the entry above.

The Rich Text Format is (at the core) a comparatively old format used for document exchange across different platforms, and is at least nominally understood by just about all word processors. As long as you stick to text only it does its job, but when using features beyond that, like pictures, problems can appear due to different implementations.

Microsoft doc is a binary format with only a partial specification publicly available; support for additional features in other software required reverse engineering. A little ugly, but it mostly works for the exchange of files, too.

Office Open XML has been the default format since MS Office 2007; internally it’s zipped XML.

OpenDocument is another internally XML-based format and has been adopted as a standard by numerous organisations and governments around the world.

LaTeX is best exchanged only with other LaTeX users; the output of automatic converters is usually not very editable at the code level, where editing usually takes place.

Fixed Layout

For a fixed layout, the most obvious choice is PDF. Ideally it should display identically on all systems, no matter what operating system or reader software. In practice, errors happen if a feature is not supported or a particular font is not available. Small devices also have a problem: their screens are often too small to adequately display a full page. Reflow formatting can address this to some extent, but the result is usually sub-optimal, and more refined forms of editing or content extraction are somewhat difficult. For further editing, OpenOffice/LibreOffice can generate PDF files which are displayed normally by any reader software but contain the original odt as well.

With the growing prevalence of e-book readers, the EPUB format is becoming increasingly popular. Internally zipped HTML, it is a bit of an inversion of PDF: the reader can and is supposed to tweak the layout to suit the output device. The price is reduced control over the final look of the document. This format can also be edited without too many headaches.
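Since EPUB (like Office Open XML) is a ZIP container at heart, its contents can be inspected with standard tools. A minimal Python sketch using the standard `zipfile` module follows; the function names and the file name `story.epub` are placeholders for this example, not part of any tool mentioned here.

```python
import zipfile


def list_container(path) -> list[str]:
    """Return the names of all files stored inside a zipped document."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def read_member(path, member: str, encoding: str = "utf-8") -> str:
    """Read one text file (e.g. an XHTML chapter) from the container."""
    with zipfile.ZipFile(path) as archive:
        return archive.read(member).decode(encoding)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python inspect_container.py story.epub
    for name in list_container(sys.argv[1]):
        print(name)
```

Reading is harmless; writing a modified container back is more delicate, because the formats have rules about what goes where inside the archive, so a dedicated editor is the safer route for actual changes.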

Other Tools, Converters

Sigil

A code-level editor specifically for epub files; possible input formats are epub, txt and HTML. From personal experience: HTML and possibly txt files should be externally converted to UTF-8 before importing.
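Such a conversion can be done in most plaintext editors; as a scripted alternative, here is a minimal Python sketch. It assumes that a file which is not valid UTF-8 is encoded in Windows-1252 (`cp1252`), which is only a guess that fits many Western European texts; the function name `convert_to_utf8` is made up for this example.

```python
from pathlib import Path


def convert_to_utf8(source: Path, target: Path, source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of `source` to `target`.

    `source_encoding` is an assumption about the input and is only used
    if the file turns out not to be valid UTF-8 already.
    """
    raw = source.read_bytes()
    try:
        # "utf-8-sig" also accepts (and drops) a leading byte order mark.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
    # Writing bytes keeps the original line breaks untouched.
    target.write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python to_utf8.py chapter.html chapter_utf8.html
    convert_to_utf8(Path(sys.argv[1]), Path(sys.argv[2]))
```

For HTML files, a charset declared inside the file itself is not touched by this sketch and may need to be corrected by hand.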

LanguageTool

LanguageTool is a Java-based open-source proofreading software (spelling and grammar) for various languages including English, French, German, Catalan and Polish. It exists both as an OO/LO plugin and as standalone software.

Writer2LaTeX

Writer2LaTeX is a Java-based converter from OO/LO Writer documents to LaTeX. It is available either as a standalone application or as a plugin for OpenOffice/LibreOffice, and is part of a package containing not just Writer2LaTeX but also Writer2xhtml and Writer4LaTeX.

latex2rtf

latex2rtf converts LaTeX to the RTF format understood by most (all?) standard word processors. While the conversion is not perfect, it can extract the content including pictures.

PDF split and merge (pdfsam)

A very useful piece of Java software to edit PDF files on a page-by-page basis. Possibilities include not just splitting and merging but a number of other rearrangements.

PDF shaper

Another PDF toolkit, with the (to me) primary distinguishing ability of converting/extracting a PDF to rtf while preserving the formatting. The practical usability of the conversion result depends on the input file.

MultiMarkdown

MultiMarkdown converts (extended) Markdown into HTML/XHTML, LaTeX, OpenDocument Text and OPML.

Pandoc

A general markup converter described as a Swiss-army knife for converting between markup formats; it also incorporates a bunch of extensions for Markdown. It can also convert to Markdown as well as to and from LaTeX (except when it doesn’t, which is a little annoying).
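Pandoc is a command-line program, so batch conversions are easy to script. The following Python sketch wraps a call such as `pandoc story.md -o story.tex`; it assumes pandoc is installed and on the search path, and the helper names `build_pandoc_command` and `convert` as well as the file names are invented for this example. Only the standard options `-o` (output file), `-f` (input format) and `-t` (output format) are used.

```python
import shutil
import subprocess


def build_pandoc_command(source, target, from_format=None, to_format=None):
    """Assemble the argument list for one pandoc conversion."""
    command = ["pandoc", source, "-o", target]
    if from_format:
        command += ["-f", from_format]  # e.g. "markdown"
    if to_format:
        command += ["-t", to_format]    # e.g. "latex"
    return command


def convert(source, target, from_format=None, to_format=None):
    """Run pandoc; raises if pandoc is missing or reports an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on the search path")
    command = build_pandoc_command(source, target, from_format, to_format)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names; pandoc can infer the formats from the
    # extensions when -f and -t are left out.
    convert("story.md", "story.tex")
```

Checking the result by hand remains advisable, since a conversion that runs without an error message is not necessarily a faithful one.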