Tugboat, Volume 35 (2014), No. 3 309 a Citation Style Language
Total Page:16
File Type:pdf, Size:1020Kb
TUGboat, Volume 35 (2014), No. 3 309 A Citation Style Language (CSL) workshop Attribution-ShareAlike 3.0 Unported license by Cre- ative Commons.5 For the search of specific citation Daniel Stender styles the online CSL style editor6 is quite useful Abstract because it provides, in addition to other features, a search by example. CSL is a free and open XML-based language for the programming of citation styles. With these styles, 3 Pandoc-citeproc bibliographical references can be printed out in dif- Among the several current applications which al- ferent ways from several database formats, including ready know how to use CSL styles for the automatic T X. The so-far over 7000 CSL styles which are Bib E generation of references, there is the popular univer- currently available can be used with several popular sal markup converter Pandoc [1].7 With short cita- applications like Zotero, Mendeley, or Pandoc. This tion keys like @doe2014 [p. 40-42] for its extended article is an introduction into the programming of Markdown lightweight markup, Pandoc can query citation styles with CSL, based on a few example bibliographical data files and recognize CSL styles to T X bibliographic database records. Bib E put out variously formatted references for documents 1 Introduction either in HTML,LATEX or ConTEXt markup, along with several others [3]. Pandoc is a command line Bibliographical references, as used in scientific publi- interface application; thus, the CSL style which is cations, are pointers to cited or regarded literature. going to be processed and the bibliographical data- Regularly, they consist of two standardized compo- base(s) are given as arguments in the program call. nents: an in-line citation (the \cite") refers to an Here's an example (some line breaks are editorial) for entry in the publication's bibliography. Despite the Pandoc's LAT X output of a random BibT X data- common concepts, there is no uniform outline for E E base record (see below for details), formatted using references; rather, each scientific discipline and ev- chicago-author-date.csl:8 ery publishing house has its own traditional set of conventions, which also might change between series. $ echo "On this, see @reference2 [p. 127]." \ In electronic typesetting, bibliographical infor- | pandoc --to=latex \ --csl=chicago-author-date.csl \ mation is often gathered in comprehensive, reusable --bibliography=references.bib data files. CSL1 is a programming language for ci- tation styles, with which differently formatted refer- On this, see Flom (2007, 127). ences can be generated from the same bibliographic databases. CSL (current version: 1.0.1) is XML-based, Flom, Peter. 2007. ``LaTeX for Academics and open and free, and was substantially developed for Researchers Who (Think They) Don't Need It.'' the all-around reference manager Zotero.2 \emph{TUGboat} 28 (1): 126--128. This article demonstrates how a rudimentary ex- As shown in this example, the bibliography is printed ample citation style could be implemented with CSL, at the end of a document. Incidentally, in the input with reference to the BibTEX data format. The usage (shown later), the title is given in lowercase; the of several programs refers to a Debian GNU/Linux titlecasing done here is automatic, a feature of this based system (like Ubuntu and Linux Mint), but style [8, chp. 14]. CSL styles could also be easily developed on other Processors which produce formatted citations operating systems. Some basic knowledge of XML out of bibliographic databases according to CSL styles and the BibTEX data format is definitely needed to are called CiteProcs.9 CiteProcs are being developed follow every detail. in several programming languages. The one which is used normally by Pandoc is pandoc-citeproc,10 writ- 2 CSL styles ten in the same functional programming language The citation styles which have already been imple- Haskell as Pandoc itself, and developed closely to- mented in CSL (file extension: .csl) are collected gether with it. This CiteProc (currently: 0.3.0.1) by the CSL developers in the official style reposi- already deals with a number of different database tory,3 and in the Zotero style repository.4 The so- far more than 7000 styles are distributed under the 5 http://creativecommons.org/licenses/by-sa/3.0/ 6 http://editor.citationstyles.org/about/ 1 http://citationstyles.org/ 7 http://johnmacfarlane.net/pandoc/ 2 http://www.zotero.org/ 8 Although given explicitly here, this CSL style is the 3 http://github.com/citation-style-language/ default in Pandoc. styles/ 9 http://en.wikipedia.org/wiki/CiteProc/ 4 http://zotero.org/styles/ 10 http://github.com/jgm/pandoc-citeproc/ A Citation Style Language (CSL) workshop 310 TUGboat, Volume 35 (2014), No. 3 formats, but it's said that it works best with BibTEX by version, while the class attribute determines that resp. BibLATEX [4] databases so far.11 this style provides cites in the running text by default, rather than as footnotes or end notes (which would 4 Developing CSL styles be \note"). CSL styles are XML, and therefore the whole related 6 Info block XML tool chain can also be used with them. For somebody who deals with XML regularly, a special- The next mandatory unit of a CSL style is an hinfoi ized editor is useful, but fundamentally CSL styles block, which provides metadata for labeling and iden- can be created and modified with any text editor. tification. A typical info block looks like this: CSL is described in detail in the specification [10], <info> and the primer which has been written by the devel- <title>An example CSL style</title> opers [9] is a good starting point for beginners. <id>http://www.danielstender.com/csldemo</id> CSL is standardized as an XML grammar in the <updated>2014-09-18T23:53:00+02:00</updated> schema language RELAX NG,12 and in principle any </info> CSL style file can checked with an XML validator Even if it is not intended to publish the style, there against the CSL schema to determine if it is correct are at least three mandatory child elements that are (valid).13 Unfortunately, some validators, for exam- meant for this purpose: ple xmllint (Debian package: libxml2-utils), cannot htitlei is the title of the CSL style as it is going to cope with XML schemes in RELAX NG compact syn- be displayed to users, tax like the one shipped by the CSL developers (file hidi contains, like xmlns in the CSL header (see extension: .rnc); the scheme has to be converted above), a random URI, which may be real or (e.g. with Trang) into the regular RELAX NG syn- fictitious, and which is solely for identification tax (file extension: .rng) before a CSL style can be purposes, validated against it: hupdatedi carries a xsd:dateTime compliant time $ git clone https://github.com/\ stamp15 of the last modification. citation-style-language/schema.git Cloning into 'schema' [...] 7 Example BIBTEX records $ trang schema/csl.rnc schema/csl.rng $ xmllint --noout -relaxng schema/csl.rng \ Before getting any deeper into CSL style program- chicago-author-date.csl ming, here are a few sample BibTEX records to be chicago-author-date.csl validates referred to hereafter to demonstrate how CSL works. A typical @Book entry type [5, chp. 13.2] goes like 5 XML declaration and CSL header this The standard XML declaration commonly starts a @Book{reference1, CSL style file: author = {Kopka, Helmut and Daly, Patrick W.}, title = {A Guide to LaTeX and Electronic <?xml version="1.0" encoding="UTF-8"?> Publishing}, Incidentally, although it is often suggested to be publisher = {Addison-Wesley}, included, specifying the UTF-8 encoding like this year = 2004, could be omitted because UTF-8 is the XML default address = {Boston}, [2, p. 28]. edition = {Fourth}} The next thing which is needed is a well-formed The next one is an @Article data set of a (well- CSL header to specify that the XML file is a CSL known) journal whose issues are counted as volume style. A standard CSL header goes like this: numbers: <style xmlns="http://purl.org/net/xbiblio/csl" @Article{reference2, version="1.0" class="in-text"> author = {Flom, Peter}, This defines hstylei to be the root element, and the title = {LaTeX for academics and researchers CSL namespace is set as the default with xmlns.14 who (think they) don't need it}, That the style is compatible with CSL 1.0 is stated journal = {TUGboat}, year = 2007, 11 cf. Readme.md. volume = 28, 12 http://relaxng.org/ number = 1, 13 http://github.com/citation-style-language/ pages = {126-128}} schema/tree/v1.0.1/ 14 At no time will a connection to this url be established; 15 http://books.xmlschemata.org/relaxng/ch19-77049. it's only for the purpose of identification. html Daniel Stender TUGboat, Volume 35 (2014), No. 3 311 And another @Article data set of a journal which ence out of the citation key in the document source. appears monthly and doesn't have any issue numbers: The cite is described within the element hcitationi, @Article{reference3, looking, for example, like this: author = {Sharma, Tushar}, <citation> title = {Why I never close Emacs}, <layout> journal = {Open Source For You}, <text macro="surname" suffix=" "/> year = 2014, <group prefix="(" suffix=")"> pages = {53-55}, <text macro="year"/> month = {jan}} <text variable="locator" prefix=":\,"/> The main body of a CSL style consists of several </group> instructions for how exactly the CiteProc is going to </layout> typeset citations, thus the cites and the correspond- </citation> ing references in the bibliography of a document, Along with the hlayouti element, hcitationi has an- from standardized data records like these.