Yanne Broux

ANCIENT PROFILES EXPLOITED: FIRST RESULTS OF NAMED ENTITY RECOGNITION APPLIED TO LATIN INSCRIPTIONS*

Introduction

Ten years after its launch (AD 2006), Trismegistos (www.trismegistos.org) is bursting at the seams. What started out as an ambitious project to aggregate the metadata of a delineated corpus (only Egypt, 800 BC – AD 800) has evolved into a highly complex conglomerate integrating data on dozens of aspects of ancient sources and society. Over the past decade it has become clear that thanks to the ‘digital turn’, which, fortunately, many ancient historians were keen to embrace, almost anything is possible – within the limits of the source material, of course.

This paper focuses on Trismegistos’ experiments with Named Entity Recognition, in particular the recent developments in regard to the extraction of personal names and toponyms from the Latin inscriptions during the project ‘Toward a Universal Facebook of the Ancient World’. A short update on the Spurii and the spuri filii is added to illustrate the possibilities this new corpus of names opens up for research.

* I would like to thank the reviewers and the editors for their comments and suggestions. Funded by the Research Foundation – Flanders (FWO).

Trismegistos and Named Entity Recognition: a brief history

Trismegistos’ initial goal was to unite papyrology, with its heavy focus on Greek, and Egyptology to advance the study of the multicultural and multilingual society of Egypt in the Graeco-Roman period.1 Disciplinary borders were dissolved by incorporating all texts, regardless of their typology, the language they were written in, or the material they were written on. Trismegistos was never designed to provide the actual text or images of ancient sources; where possible it refers to its partners for this. Instead, the focus was (and still is) on textual metadata (information about publications, dates, provenance, language, material, etc.), as well as related information that was recycled from previous projects, such as the Prosopographia Ptolemaica and the Fayum Project.2 The main database, TM Texts, currently contains 149,926 documents from Egypt,3 and is – directly and indirectly – connected to more than 30 other databases storing information about or mentioned in these sources.

In light of the project ‘Creating Identities in Graeco-Roman Egypt’ (2008–2012), the Prosopographia Ptolemaica, which had been incorporated in the subsection TM People, was expanded. All references to individuals in the Egyptian texts were to be collected, obviously an enormous task. Luckily, the Greek papyrological texts had been digitized and were available online through the Duke Databank of Documentary Papyri (DDbDP; now integrated in the Papyrological Navigator4). Thanks to this digital corpus, the computer would be able to ‘tag’ names in these texts automatically, a process called Named Entity Recognition (NER).

1 For a more extensive overview of the development of Trismegistos (focusing on its core component: TM Texts), see M. Depauw & T. Gheldof, ‘Trismegistos. An interdisciplinary platform for ancient world texts and related information’, [in:] L. Bolikowski et al. (eds.), Theory and Practice of Digital Libraries – TPDL 2013 Selected Workshops [= Communications in Computer and Information Science 416], Cham – Heidelberg 2014, pp. 40–52.
2 Prosopographia Ptolemaica: http://ldab.arts.kuleuven.be/prosptol/pp and L. Mooren, ‘The automatization of the Prosopographia Ptolemaica’, [in:] I. Andorlini, G. Bastianini, M. Manfredi & G. Menci (eds.), Pap.Congr. XXII, pp. 995–1008; the Fayum Project: www.trismegistos.org/fayum.
3 February 5, 2018.
4 http://papyri.info.

Since none of the existing NER applications were geared toward the specifics of ancient Greek, a custom procedure in three phases was set up to extract and interpret the names:5 first, an onomastic gazetteer was compiled; this was then matched with the full text of the DDbDP to tag individual names; and finally genealogical and prosopographical identifications were carried out.

Fig. 1 Different levels of the onomastic gazetteer

For the onomastic gazetteer, Trismegistos could fall back on the collection of names in the Prosopographia Ptolemaica, expanding this through the extraction of personal names from all capitalized words in the DDbDP. Since Greek is a case language, the gazetteer not only included the standard, nominative forms of the names, but all possible inflected forms of each possible variant, even if these are unattested or even only theoretically possible (see Fig. 1).6 This list was then matched to the DDbDP: if a word appeared in the gazetteer, it was tagged as a recognized personal name, or, if a form could refer to two names, as an ambiguous personal name.7 These tags were then checked by humans to resolve the ambiguous forms and to filter out mistakes (add unrecognized names, e.g. those of which the beginning is lost, and remove residue such as divinities or names of emperors, etc.).

5 For a detailed description of the procedure, see B. Van Beek & M. Depauw, ‘People in Greek documentary papyri. First results of a research project’, JJurP 39 (2009), pp. 31–47.
6 This gazetteer can also be consulted online at www.trismegistos.org/nam/search.php.
7 Place names were also tagged in the process, and are available in TM Geo (www.trismegistos.org/geo).
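The matching step described above can be illustrated with a minimal sketch. It assumes a toy gazetteer that maps each inflected form to the set of names it may belong to; the three Latin names and their forms are illustrative assumptions (chosen for readability, not taken from TM People), and the function name `tag_tokens` is hypothetical.

```python
from collections import defaultdict

# Toy gazetteer: inflected form -> set of candidate names.
# Entries are illustrative assumptions, not data from TM People.
GAZETTEER = defaultdict(set)
for name, forms in {
    "Marcus": ["Marcus", "Marci", "Marco", "Marcum"],
    "Marcius": ["Marcius", "Marcii", "Marci", "Marcio", "Marcium"],
    "Spurius": ["Spurius", "Spurii", "Spuri", "Spurio", "Spurium"],
}.items():
    for form in forms:
        GAZETTEER[form].add(name)


def tag_tokens(tokens):
    """Label each token as recognized, ambiguous, or unrecognized."""
    tagged = []
    for token in tokens:
        candidates = GAZETTEER.get(token, set())
        if not candidates:
            status = "unrecognized"   # left for human review
        elif len(candidates) == 1:
            status = "recognized"     # exactly one possible name
        else:
            status = "ambiguous"      # form could belong to several names
        tagged.append((token, status, sorted(candidates)))
    return tagged
```

In this sketch the form `Marci` is flagged as ambiguous because it is listed under two names, which is the kind of case the text describes as being resolved by human checkers.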

Table 1. Excerpt of the original NER rulebook

Sequence                                                      Interpretation
nominative                                                    person
accusative – ὃς καί – nominative – καλεῖται                   person – double
dative – genitive                                             person – father
nominative – genitive – τοῦ – genitive – μητρός – genitive    person – father – grandfather – mother

In the final phase, strings of consecutive names, called ‘identification clusters’, were interpreted to distinguish individuals in longer strings and to establish genealogical relationships. In antiquity, a person was commonly identified by a combination of his given name and his patronymic. In Egypt, during the first three centuries AD the metronymic and even papponymic was often added as well,8 as in Ἰσχυρίων Ἥρωνος τοῦ Φιλάμμωνος μη(τρὸς) Ἡρακλοῦτο(ς). A rulebook was therefore compiled to teach the computer how to cope with these strings (Table 1). It takes into account the specific cases of each name, as genealogical identifiers are generally appended in the genitive. It also includes introductory terms, which would otherwise break up the identification cluster; these are mainly words that mark filiation (τοῦ, μητρός, ἀπάτωρ, …) or double names (ὁ καί, ἐπικαλούμενος, …). More than 150 rules were defined, and matched with the names to automate the assignment of the names in the cluster to specific individuals (person, father, mother, …). The results of this markup were again checked by humans to intercept irregularities. Working text by text also made it possible to identify individuals recurring in a single text.

In the end, some 350,000 attestations of names were added to TM People from the Greek papyrological material. Names occurring in Egyptian texts are still being entered manually, as no comprehensive digital repositories are available. As a result, TM People now counts 497,296 attestations.9 It has proven to be a fruitful base for research, bringing forth many studies that started from this data set, for example on double names,10 Christian names,11 hybrid names,12 and children named after Hellenistic queens.13

8 Y. Broux & M. Depauw, ‘The maternal line in Greek identification. Signalling social status in Roman Egypt’, Historia 64 (2015), pp. 467–478.
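How a rulebook of this kind might be applied to an identification cluster can be sketched as follows. The four rules are those excerpted in Table 1; everything else is an assumption made for illustration: the cluster is taken to be already tagged with cases, introductory terms are tagged with themselves, and the names `RULEBOOK` and `interpret_cluster` are hypothetical. The actual rulebook was kept in Filemaker and contained more than 150 rules.

```python
# Rules excerpted from Table 1: a sequence of cases and introductory terms
# maps to the roles of the names in the cluster (terms carry no role).
RULEBOOK = {
    ("nominative",): ("person",),
    ("accusative", "ὃς καί", "nominative", "καλεῖται"): ("person", "double"),
    ("dative", "genitive"): ("person", "father"),
    ("nominative", "genitive", "τοῦ", "genitive", "μητρός", "genitive"):
        ("person", "father", "grandfather", "mother"),
}
CASES = {"nominative", "genitive", "dative", "accusative"}


def interpret_cluster(cluster):
    """Map the names of a cluster to roles.

    cluster: list of (form, tag) pairs, where tag is either a case
    or the introductory term itself (an assumed input format).
    Returns a dict role -> form, or None if no rule matches.
    """
    sequence = tuple(tag for _, tag in cluster)
    roles = RULEBOOK.get(sequence)
    if roles is None:
        return None  # no rule applies: leave the cluster for manual checking
    names = [form for form, tag in cluster if tag in CASES]
    return dict(zip(roles, names))


# The example from the text, with the abbreviations expanded.
example = [
    ("Ἰσχυρίων", "nominative"),
    ("Ἥρωνος", "genitive"),
    ("τοῦ", "τοῦ"),
    ("Φιλάμμωνος", "genitive"),
    ("μητρός", "μητρός"),
    ("Ἡρακλοῦτος", "genitive"),
]
```

Calling `interpret_cluster(example)` assigns Ἰσχυρίων to the person, Ἥρωνος to the father, Φιλάμμωνος to the grandfather and Ἡρακλοῦτος to the mother; clusters without a matching rule return `None`, mirroring the human check described above.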

Expanding to Latin inscriptions

Since 2010, Trismegistos has been expanding its horizons by abandoning its initial geographical limitations. The scope is now widened to include the entire ancient Mediterranean (apart from cuneiform). As one of the partners in the Europeana network of Ancient Greek and Latin Epigraphy (EAGLE) consortium,14 Trismegistos was able to incorporate all Latin inscriptions provided in EAGLE’s partner databases, including the Epigraphische Datenbank Clauss-Slaby (EDCS), in all some 350,000 texts (2013–2016). Smaller corpora of indigenous languages have been added manually (e.g. Etruscan on the basis of Meiser’s editions15), or will be added through new partnerships with other projects (e.g. DASI for South-Arabian16). The final hurdle will be the Greek inscriptions with the help of the ‘Integrating Digital Epigraphies’ (IDEs) collective, which brings together the main players in the field: PHI, SEG, and CLAROS.17

9 February 5, 2018.
10 S. Coussement, ‘Because I am Greek’: Polyonymy as an Expression of Ethnicity in Ptolemaic Egypt [= Studia Hellenistica LV], Leuven 2016; Y. Broux, Double Names and Elite Strategy in Roman Egypt [= Studia Hellenistica LIV], Leuven 2015.
11 M. Depauw & W. Clarysse, ‘How Christian was fourth-century Egypt? Onomastic perspectives on conversion’, Vigiliae Christianae 67 (2013), pp. 407–435.
12 N. Dogaer & M. Depauw, ‘Horion & Co. Greek hybrid names and their value for the study of intercultural contacts in Graeco-Roman Egypt’, Historia 66 (2017), pp. 193–215.
13 Y. Broux & W. Clarysse, ‘Would you name your child after a celebrity? Arsinoe, Berenike, Kleopatra, Laodike and Stratonike in the Greco-Roman East’, ZPE 200 (2016), pp. 347–362.
14 www.eagle-network.eu.
15 G. Meiser, Etruskische Texte. Editio minor [= Studien zur historisch-vergleichenden Sprachwissenschaft IV], Hamburg 2014.
16 http://dasi.humnet.unipi.it.
17 IDEs: http://blogs.library.duke.edu/dcthree; PHI: http://epigraphy.packhum.org/inscriptions; SEG: http://hum.leiden.edu/history/research/projects-umw/seg.html; CLAROS: www.dge.filol.csic.es/claros/cnc/2cnc.htm.

Once the metadata of the Latin inscriptions was integrated, the road was paved to upgrade TM People as well. The full text of these sources is available through the EDCS,18 so a procedure similar to the one devised for the Greek papyrological material could be used to extract the relevant information. Several adjustments were made to cater to Roman nomenclature and to simplify and speed up the process.

Since Latin names are also attested in Egypt, Trismegistos already had an extensive list at its disposal (280,200 inflected forms). The rulebook was to some extent already familiar with the Roman naming system, with its unique use of family names and its shifting conventions regarding individuating names.19 It needed to be fine-tuned, however, to cope with longer identification clusters consisting of several components referring to the same individual, such as Pescennius Cai filius Papiria Victor Severianus. New rules were therefore drafted for the duo and tria nomina, also taking into account the standards for filiation and the tribus (Table 2).

Table 2. Excerpt of the updated NER rulebook

Sequence                                                      Interpretation
accusative – accusative – accusative                          person
nominative – nominative – genitive – filius – nominative      person – father
dative – dative – tribus – dative                             person – tribe
ablative – ablative – genitive – libertus – ablative          person – former master

Whereas the first time around the different steps of the NER procedure were tackled in different systems (e.g. the gazetteer and rulebook in Filemaker, the human quality control in an online PHP/MySQL environment), this time everything is incorporated in a single Filemaker structure (apart from the gazetteer, which is part of the original TM People database). This way, more steps could be automated without external help, generating quicker results.

907,064 clusters of capitalized words were extracted from EDCS, of which 433,156 contain at least one personal name. At the time of writing (January 2017), a little over 70% of these identification clusters have been checked to establish whether the cases are interpreted correctly, and to add unrecognized names or remove false positives.20 Each time a cluster is ‘approved’, it is added to a linked database where it is automatically matched to the transformed and updated rulebook (the Filemaker version of which now contains 623 rules).

The next step will again be to check whether the rules have been applied correctly. This does not only involve interpreting the genealogical composition of the cluster, however. Given the structure of a Roman name, with a praenomen, a gentilicium, and a cognomen, an extra step has to be built in to make sure that each of these components is tagged correctly. Since common praenomina and gentilicia are marked as such in the gazetteer, this can be automated as well. Finally, individuals mentioned more than once within a single text will also be identified during this stage (although this is not as common as in papyri).

18 www.manfredclauss.de.
19 See B. Salway, ‘What’s in a name? A survey of Roman onomastic practice from c. 700 BC to AD 700’, JRS 84 (1994), pp. 124–145, for a concise overview of the evolution of Roman naming traditions.
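The extra step of tagging the components of a Roman name can be sketched along the following lines. This is a simplified illustration under stated assumptions, not the Filemaker procedure itself: the cluster is assumed to be in the nominative, the small sets standing in for the gazetteer flags are hypothetical, and so is the function name `tag_components`.

```python
# Hypothetical stand-ins for the gazetteer flags; a real gazetteer is far larger.
PRAENOMINA = {"Caius", "Lucius", "Marcus", "Spurius"}
GENTILICIA = {"Pescennius", "Cocceius", "Afinius"}
TRIBUS = {"Papiria", "Quirina"}
# Term -> role of the genitive that immediately precedes it.
FILIATION_TERMS = {"filius": "father", "filia": "father", "libertus": "former master"}


def tag_components(tokens):
    """Assign a component label to each word of a nominative name cluster."""
    tags = []
    for i, token in enumerate(tokens):
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        if token in FILIATION_TERMS:
            tags.append((token, "term"))
        elif following in FILIATION_TERMS:
            # Genitive before filius/libertus names the father or former master.
            tags.append((token, FILIATION_TERMS[following]))
        elif token in TRIBUS:
            tags.append((token, "tribus"))
        elif token in PRAENOMINA and not tags:
            # A praenomen is only accepted in first position.
            tags.append((token, "praenomen"))
        elif token in GENTILICIA and all(label != "gentilicium" for _, label in tags):
            tags.append((token, "gentilicium"))
        else:
            tags.append((token, "cognomen"))
    return tags
```

Applied to the example from the text, `tag_components("Pescennius Cai filius Papiria Victor Severianus".split())` labels Pescennius as gentilicium, Cai as the father, Papiria as the tribus, and Victor and Severianus as cognomina. Because position is taken into account, a name such as Spurius is only read as a praenomen when it opens the cluster.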

First results: the case of the Spurii

I would like to conclude this paper with some preliminary results regarding the Spurii/spurii, since they appear in several other chapters of this book as well. Spurius was originally a personal name, and was used both as a praenomen and as a gentilicium (see below). However, due to some confusion regarding abbreviations in inscriptions (initially the abbreviation SPF stood for Spuri filius, denoting actual filiation, but later on it was also used for sine patre filius), the designation spuri filius became associated with illegitimacy.21 I will not elaborate on the legal or social status of these children here, as this is discussed in more detail elsewhere.22 I will rather present an updated overview of the available data, since it has become much easier and faster to quantify and present (relatively) large bodies of evidence with the help of digital methods and tools. Previously, even a limited dataset such as this one would have taken months to collect and annotate. Now, this can be managed in a week, even when counting the final phase of the NER process, which had to be performed before the data was ready for analysis.

20 The entire eastern Mediterranean, North Africa, the Spanish peninsula, the Danube region, Italy, and Rome have thus been checked so far.
21 O. Salomies, Die römischen Vornamen. Studien zur römischen Namengebung [= Commentationes Humanarum Litterarum LXXXII], Helsinki 1987, p. 51.

Fig. 2. Spurius vs spuri filius

In all, 794 references to people named Spurius or with the designation spuri filius were filtered from EDCS (Fig. 2).23 Although this paper focuses on the NER results of the Latin inscriptions, I have (manually) added the attestations of the Spurii/spurii found in Greek inscriptions to this dataset as well (59 attestations). While Greek had its own traditions to indicate fatherlessness,24 in some contexts, the Latin description was preferred. Taken together, there are thus 174 individuals who bear the name Spurius, and 624 people who are styled as spuri filius.25

22 See in this volume M. Krawczyk, ‘Paternal onomastical legacy vs. illegitimacy in Roman epitaphs’, pp. 107–128, and M. Nowak, ‘Get your free corn: The fatherless in the corn-dole archive from Oxyrhynchos’, pp. 215–227. See also Salomies, Die römischen Vornamen (cit. n. 21), pp. 51–55; B. Rawson, ‘Spurii and the Roman view of illegitimacy’, Antichthon 23 (1989), pp. 10–41; M. Nowak, ‘Ways of describing illegitimate children vs. their legal situation’, ZPE 193 (2015), pp. 207–218, for further discussions.
23 In EDCS, all instances of Spurius are capitalized, even if the ‘filiation’ obviously refers to fatherlessness, which was actually rather fortunate, since they were thus incorporated automatically during the NER process.

The name Spurius

The name Spurius was originally used both as a praenomen and as a gentilicium, but the two have different origins. The former is Etruscan, while the latter is probably Oscan.26 Figure 3 shows the breakdown of the number of people and attestations:

Fig. 3. Spurius as praenomen vs gentilicium vs cognomen

24 See e.g. D. Ogden, Greek Bastardy in the Classical and the Hellenistic Periods, Oxford 1996; S. R. Huebner & D. M. Ratzan (eds.), Growing up Fatherless in Antiquity, Cambridge 2009; Nowak, ‘Ways of describing illegitimate children’ (cit. n. 22), p. 207ff.
25 The data for this section can be downloaded at www.trismegistos.org/tmcorpusdata/6. When referring to a Latin inscription, I use the TM number, the stable identifier provided for each text (for more information: www.trismegistos.org/about_identifiers and www.trismegistos.org/about_how_to_cite). This can be used to search for the text through the homepage, or in the stable URI www.trismegistos.org/text/*fill in TM number here*. The PHI number used for the Greek inscriptions can be used in the URI http://epigraphy.packhum.org/text/*fill in PHI number here*.
26 Salomies, Die römischen Vornamen (cit. n. 21), pp. 51 and 160.
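For readers who want to follow up the references in bulk, the URI patterns given in note 25 can be filled in programmatically. The sketch below only wraps those two patterns; the function names are hypothetical, and the `https` default scheme for Trismegistos is an assumption, since the note gives that address without a scheme.

```python
# URI patterns as given in note 25; the "https" default scheme is an assumption.
TM_TEXT_TEMPLATE = "{scheme}://www.trismegistos.org/text/{number}"
PHI_TEXT_TEMPLATE = "http://epigraphy.packhum.org/text/{number}"


def tm_uri(tm_number, scheme="https"):
    """Build the stable URI for a Trismegistos text from its TM number."""
    return TM_TEXT_TEMPLATE.format(scheme=scheme, number=int(tm_number))


def phi_uri(phi_number):
    """Build the URI for a Greek inscription from its PHI number."""
    return PHI_TEXT_TEMPLATE.format(number=int(phi_number))
```

For instance, `tm_uri(584103)` yields the address of the inscription cited as TM 584103, and `phi_uri(65072)` that of PHI 65072.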

52 people bear Spurius as a praenomen. Since women lacked praenomina, these are all men. With 105 individuals, the gentilicium is more common; 81 are men, 24 are women. Finally, the name is attested as a cognomen nine times (2 women).27

Figure 4 gives an overview of the chronological spread of the name Spurius in its different positions (praenomen, gentilicium, and cognomen). Only those people with a date range spanning less than 200 years have been included to minimize distortion. These are still raw numbers, since not all provinces have been processed yet (see above), and they are, unfortunately, very low: only 28 dated individuals with a praenomen,28 36 with a gentilicium, and 5 with a cognomen (which, given the total of only nine examples, is actually high in comparison to the other two!).29 To get some perspective concerning the popularity of the name, the total number of Latin inscriptions for each period, representing the so-called epigraphic habit, has therefore been added (the dotted grey line). To ensure an even distribution, the graph is ‘weighted’, i.e. if a person is attested in multiple years (which is generally the case as inscriptions are seldom dated to an exact year), he or she is ‘divided’ over the respective years.30
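The weighting can be made concrete with a short sketch. It assumes that each individual comes with an inclusive date range in years (negative for BC, with no year 0), that the weight is spread evenly over those years, and that ‘less than 200 years’ refers to the number of years in the range; the function names are hypothetical, and the published method cited in n. 30 may differ in its details.

```python
from collections import Counter
from fractions import Fraction


def years_in_range(start, end):
    """Inclusive list of years; BC years are negative and there is no year 0."""
    return [year for year in range(start, end + 1) if year != 0]


def weighted_distribution(date_ranges, max_span=200):
    """Spread each individual evenly over the years of his or her date range."""
    totals = Counter()
    for start, end in date_ranges:
        years = years_in_range(start, end)
        if len(years) >= max_span:
            continue  # too imprecisely dated: excluded, as described in the text
        for year in years:
            totals[year] += Fraction(1, len(years))
    return totals
```

With this sketch, an individual dated to 144–143 BC, passed as `(-144, -143)`, contributes 1/2 to each of the two years, and the weights of every included individual always sum to 1, so the totals remain comparable to a simple head count.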

27 Spurius also appears to be used twice as a single name: in CIL VI.3 21200 (TM 584103, AD 1–50) and in ILBelgique (2nd ed.) 159 ter (TM 209482, AD 200–299), but perhaps the context just did not call for the full duo or tria nomina here.
28 EDCS contains six examples of people in inscriptions dated to the Imperial period with a praenomen abbreviated as ‘S’, supplemented as S(purius): S. Q. libertus in CIL V.1 78 (TM 555675, AD 1–99), S. Romanus in CIL XVI 128 (TM 411636, AD 178), S. Egnatius Paulus in CIL XVI 133 (TM 212948, AD 192), S. Baebius Fortunatus in Inscriptiones Italiae III.1 178 (TM 247393, AD 175–250), and S. Teius Ligdamus and his son S. Teius Arelonius in Inscriptiones Italiae III.1 286 (TM 247354, AD 175–299). However, in this period, this can mean nothing but S(extus). They have therefore not been included in the data set.
29 Trismegistos is working on narrowing down the dates of inscriptions when possible, but this is obviously very time-consuming, since this is manual work.
30 E.g. Spurius Rutilius is attested in an inscription from Delos dated to 144–143 BC (ID 2616 = PHI 65072), so he counts as 0.5 in each year; B. Van Beek & M. Depauw, ‘Quantifying imprecisely dated sources. A new inclusive method for charting diachronic change in Graeco-Roman Egypt’, AncSoc 43 (2013), pp. 101–114.

Fig. 4. Chronological evolution of the name Spurius

The discrepancy between the Republican and Imperial periods is striking in this graph. The results for the latter are in line with expectations, as both types more or less follow the same curve as the number of texts. These fluctuations are therefore the results of epigraphic traditions, and thus say little about the name’s popularity at this point. During the Republican era, however, Spurius is used almost exclusively as a praenomen. The fourth and third centuries BC yield few results. This is partially due to the fact that only inscriptions are taken into consideration here. But as Rawson also pointed out on the basis of Broughton’s Magistrates of the Roman Republic, the name was rare tout court.31 The period 150–76 BC shows some exceptional peaks that cannot be explained by the epigraphic habit. It turns out that most of these people are actually attested in Greek inscriptions from the eastern Mediterranean. What this graph therefore reveals is not a sudden increase in the number of Spurii, but rather that Greek inscriptions are more accurately dated than Latin ones.32

Figure 5 presents the geographical spread of the name Spurius. The results are grouped together by region (Italy, Asia Minor, and the Greek islands) or by province (see Table 3 for an overview of all regions). Rome is marked separately to clearly show the distinction with the rest of Latium/Campania; the same goes for Delos and the rest of the Greek islands, as Delos yielded more results than the rest of the eastern Mediterranean all together. Praenomina are represented by black circles, gentilicia by dark gray, and cognomina by light gray. If more than one type occurs in a region, the circles are stacked from dark (bottom) to light (top). In Delos (no. 35), for example, where a smaller dark gray circle sits on top of a larger black circle, there are more attestations of Spurius as praenomen than as gentilicium. In Etruria (no. 7), it is the other way around (see the legend in the top left corner for the scale). The total number of Latin inscriptions attested in each region is added in Table 3 to provide some perspective.

The name Spurius is heavily concentrated in the Italic regions, especially in Latium/Campania (no. 1), followed by Venetia/Histria (no. 10). This is to be expected given the large number of inscriptions from these regions, but perhaps in Latium/Campania the high number of gentilicia is a remnant of Oscan influence? Transpadana (no. 11) is third in line, and this is a region with relatively few texts, so the name (especially as a gentilicium) did seem to gain some popularity there. Rome (no. 0), on the other hand, with its disproportionately large corpus, is usually well-represented, but in this case, there are very few examples. Despite the fact that the praenomen originated in Etruria (no. 7), only one person is attested there; the gentilicium actually occurs more often. In the western provinces, only Narbonensis (no. 17) stands out, but again, this is a region with a lot of texts. Finally, in the East, most examples are concentrated on Delos. As a cult center, the island attracted devotees from far and wide, including Romans, who commemorated their visits in stone.

31 Rawson, ‘Spurii and the Roman view of illegitimacy’ (cit. n. 22), pp. 29–30; T. R. S. Broughton, The Magistrates of the Roman Republic, 3 vols., New York – Atlanta 1951–1986.
32 The exceptional peak of the gentilicium Spurius in 125–101 BC is the result of a single funerary monument mentioning a Cnaeus Spurius and two Marcii Spurii, perhaps his sons (CIL I(2).2.4 3130 = TM 250591, 125–100 BC).
For 19 of the men with the praenomen Spurius (part of) the father’s name is known: 10 of them are Spuri filii (which in this case obviously has nothing to do with illegitimacy), but since there is no information about their siblings, it is impossible to draw conclusions about birth order (as it is generally assumed that the firstborn son inherited his father’s praenomen).

Among those with Spurius as a gentilicium, only 20 mention their father. In most cases where this is simply the praenomen, it is safe to assume that the gentilicium Spurius was inherited from the father as well. However, sometimes more detailed genealogical information can be deduced from the text, which shows that things were not so straightforward. What about Spuria Veneria, daughter of Venerius,33 and Spuria Firmiana, daughter of Mettelus Restutus,34 for example? Spuria Veneria’s cognomen (or double gentilicium, if you prefer) is clearly derived from the father (a freedman), but where does the Spuria come from in both cases? Did they inherit it from their mothers?

33 CIL Suppl. Italica 800 (TM 502672).
34 CIL XI.1 1025 (TM 518063).

The epitaph of Spuria (?) is especially intriguing.35 The dedication reads:

S]pu/ria vi/ta(m) fini/vit r(elictis) / miser/rimis / patre / n(aturale) suo / et Aelia / Victorina / matre / qui hoc / pii d(ederunt) f(ecerunt)

Spuria (?) ended her life, leaving behind in misery her natural father and her mother Aelia Victorina, who have paid for and erected this piously.

It was erected by both her parents, but only her mother is mentioned by name (Aelia Victorina). Her father is simply styled as ‘her natural father’ (patre naturale suo). It is clear from the text that she had known her father, that he had been involved in her life. He is acknowledged, yet not mentioned by name. Either he did not want to be known, or his identity was not important here. Whatever the reason, it is clear that, officially, he had no connection with this girl.36

Finally, Caius Afinius Spurius is the only person with Spurius as a cognomen who is mentioned with his filiation: Spuri filius.37 The first letter of each line is broken off, so the ‘S’ of both Spurius names is lost. There is little doubt about the reconstruction of the paternal praenomen, as there are no others that end in ‘-purius’. The cognomen is trickier, however, since there was more freedom in choosing this name. If it is Spurius, then either this person adopted his father’s praenomen as his own cognomen, which is very rare,38 or this is a spuri filius with a cognomen used as an extra status designation.

Although scarce, examples like these remind us to be cautious and not to take the name Spurius at face value. Just as Aurelius became more of a status marker than an actual name after AD 212, perhaps Spurius was also used as a pseudo-name in some cases.

35 ILAfrique 173 (TM 364149).
36 Another possibility is that the word spuria in this epitaph should not be read as a name, but rather as an adjective or noun. I find it rather odd, however, that the deceased would then nowhere be mentioned by name.
37 CIL IX 2696 (TM 551583, AD 1–99).
38 Salomies, Die römischen Vornamen (cit. n. 21), pp. 164–165, who does not list Spurius as an example.

Fig. 5. Geographical spread of the name Spurius39

Fig. 6. Geographical spread of the spuri filii39

39 The maps in Figures 5 and 6 were created in Palladio (http://hdlab.stanford.edu/palladio).

Table 3. Regions in the Roman Empire

0 Rome (n = 120,760)
1 Latium et Campania – Regio I (n = 32,958)
2 Apulia et Calabria – Regio II (n = 6,558)
3 Bruttium et Lucania – Regio III (n = 2,609)
4 Samnium – Regio IV (n = 6,632)
5 Picenum – Regio V (n = 2,587)
6 Umbria – Regio VI (n = 5,490)
7 Etruria – Regio VII (n = 19,874)
8 Aemilia – Regio VIII (n = 5,008)
9 Liguria – Regio IX (n = 1,797)
10 Venetia et Histria – Regio X (n = 17,138)
11 Transpadana – Regio XI (n = 4,598)
12 Germania Inferior (n = 11,277)
13 Germania Superior (n = 21,055)
14 Belgica (n = 17,667)
15 Lugdunensis (n = 15,613)
16 Alpes Maritimae (n = 866)
17 Narbonensis (n = 21,509)
18 Aquitania (n = 16,794)
19 Hispania Citerior (n = 19,742)
20 Lusitania (n = 7,279)
21 Baetica (n = 7,135)
22 Mauretania (n = 6,570)
23 Numidia (n = 16,564)
24 Africa Proconsularis (n = 42,397)
25 Sardinia (n = 2,417)
26 Sicilia (n = 6,113)
27 Aegyptus (n = 26,904)40
28 Syria (n = 2,642)
29 Galatia (n = 811)
30 Bithynia et Pontus (n = 434)
31 Mysia (n = 43)
32 Lydia (n = 116)
33 Ionia (n = 524)
34 Caria (n = 218)
35 Delos (n = 291)
36 Greek islands (n = 444)41
37 Achaia (n = 3,085)
38 Macedonia (n = 3,578)
39 Thracia (n = 585)
40 Moesia Inferior (n = 3,644)
41 Dalmatia (n = 9,751)
42 Pannonia Inferior (n = 4,289)
43 Pannonia Superior (n = 6,605)
44 Noricum (n = 4,327)
45 Barbaricum (outside Roman Empire) (n = 12,773)

40 For Egypt, only inscriptions (i.e. material = stone or metal) are counted, but these also include those in Greek, as they were incorporated in Trismegistos from the very beginning.
41 Examples of the name Spurius and spuri filius were found on Crete, Kos, Lesbos, Naxos, and Samothrace, so the total includes inscriptions from these islands only.

The designation spuri filius

Back in 1989, Rawson counted 184 spuri filii (it is not clear though if she meant attestations or individuals).42 With the help of NER, 612 individuals could now be extracted from the inscriptions (see Fig. 2):43 367 men, versus 245 women.44 Figure 7 shows the chronological evolution of the designation spuri fil- ius.45 The handful of examples from the third and second centuries BC are mainly women (four out of six). Three are from Italy and bear a single name (the name of their father’s gens), as was custom at the time. The fourth ex- ample, however, from Delos, is Tertia Stertinia Alexandra.46 A woman with tria nomina this early on is very rare. If you add her Greek cognomen to her status as spuri filia, her tria nomina seem to reflect a mixed background (a Roman mother and a Greek father).47 Most examples of spuri filius are concentrated in the first century AD. Their figures start rising at the same time the number of Latin inscriptions goes up (by the end of the first centu- ry BC), peak during the first quarter of the next century, and follow the same declining trend during the next 50 years. It is tempting to see a connection with the marriage laws issued under Augustus here. While the total number of inscriptions stabilizes in the last quarter of the century, however, the spuri filii continue to drop, leaving only a handful of examples by the beginning of the third century. A map showing the geographical distribution of the spuri filii presents the same discrepancy between Italy and the provinces (Fig. 6). This time, most individuals are attested in Rome (no. 0), however. The elevated figures

42 Rawson, ‘Spurii and the Roman view of illegitimacy’ (cit. n. 22), p. 29. 43 These do not include those individuals styled asSpuri filius who bore the name Spurius themselves, since in these cases we are most likely dealing with actual filiation (see the sec- tion on ‘The name Spurius’). 44 With 60% vs 40%, this is not as evenly balanced also Rawson’s data set, which was 52% vs 48%: Rawson, ‘Spurii and the Roman view of illegitimacy’ (cit. n. 22), p. 31. 45 Again, only those people dated to less than 200 years are included (268 exx.). 46 EAD XXX 161 (PHI 215194, 125–100 BC). 47 M.-T. Le Dinahet, ‘Les Italiens de Délos: compléments onomastiques et prosopo- graphiques’, REA 103 (2001), pp. 103–123, esp. p. 113. ANCIENT PROFILES EXPLOITED 29

The designation spuri filius

Back in 1989, Rawson counted 184 spuri filii (it is not clear, though, whether she meant attestations or individuals).42 With the help of NER, 612 individuals could now be extracted from the inscriptions (see Fig. 2):43 367 men versus 245 women.44 Figure 7 shows the chronological evolution of the designation spuri filius.45 The handful of examples from the third and second centuries BC are mainly women (four out of six). Three are from Italy and bear a single name (the name of their father’s gens), as was customary at the time. The fourth example, however, from Delos, is Tertia Stertinia Alexandra.46 A woman with tria nomina this early on is very rare. Taken together with her status as spuri filia, her Greek cognomen suggests that her tria nomina reflect a mixed background (a Roman mother and a Greek father).47

42 Rawson, ‘Spurii and the Roman view of illegitimacy’ (cit. n. 22), p. 29.
43 These do not include those individuals styled as Spuri filius who bore the name Spurius themselves, since in these cases we are most likely dealing with actual filiation (see the section on ‘The name Spurius’).
44 With 60% vs 40%, this is not as evenly balanced as Rawson’s data set, which was 52% vs 48%: Rawson, ‘Spurii and the Roman view of illegitimacy’ (cit. n. 22), p. 31.
45 Again, only those people dated to within a range of less than 200 years are included (268 exx.).
46 EAD XXX 161 (PHI 215194, 125–100 BC).
47 M.-T. Le Dinahet, ‘Les Italiens de Délos: compléments onomastiques et prosopographiques’, REA 103 (2001), pp. 103–123, esp. p. 113.

Fig. 7. Chronological evolution of the spuri filii

Most examples of spuri filius are concentrated in the first century AD. Their figures start rising at the same time as the number of Latin inscriptions goes up (by the end of the first century BC), peak during the first quarter of the next century, and follow the same declining trend during the next 50 years. It is tempting to see a connection with the marriage laws issued under Augustus here. While the total number of inscriptions stabilizes in the last quarter of the century, however, the spuri filii continue to drop, leaving only a handful of examples by the beginning of the third century.

A map showing the geographical distribution of the spuri filii presents the same discrepancy between Italy and the provinces (Fig. 6). This time, however, most individuals are attested in Rome (no. 0). The elevated figures in Latium/Campania (no. 1) and Venetia/Histria (no. 10) are again probably the result of the larger text corpora from these regions. In Samnium (no. 4) and Aemilia (no. 8) the phenomenon seems to have been more frequent when the low number of texts is taken into account. Overall, numbers are low throughout the provinces. In the East, this is to be expected, as the region had its own traditions for designating illegitimate children (e.g. νόθος or ἀπάτωρ).48 In the West as well, examples are rare. One would expect more mixed marriages, and therefore more illegitimate offspring, in these areas. Perhaps endogamy was practiced more strictly since citizenship was scarcer? Or maybe illegitimacy was not emphasized in local cultures, and this had an influence on nomenclature?

Given their status as fatherless children, the maternal line of these spuri filii is mentioned more often than the paternal line. For 51 people, only the mother is known with certainty; 38 of them bear the same gentilicium as her. Sometimes the father is mentioned by name, however. This is generally in combination with the mother (23 exx.).
It is striking that many of these individuals inherited their gentilicium (some even their whole name) from their father (8 exx.); in no way does their name refer to their mother. In some cases, both parents have the same gentilicium (7 exx.). These were either both liberti of the same household, or the mother was a freedwoman of the father. This is in stark contrast to Rawson’s observation that while the mother was always free, ‘almost always the father is a slave’.49 Only six people of whom both parents are known have no onomastic ties with their father, only with their mother. Finally, there are only eight individuals with a known father and an unknown mother, of whom five show clear ties with their father’s names.

48 There are 24 spuri filii in Greek inscriptions, where the designation is transcribed as Σπορίου υἱός (θυγάτηρ), although the filiation marker is just as often left out. Since these are all very short inscriptions, it is impossible to ascertain whether these may have been children of actual Spurii.
49 Rawson, ‘Spurii and the Roman view of illegitimacy’ (cit. n. 22), p. 32.

There are several examples, however, where the context hints at a father, but the relationship cannot be established with certainty, e.g. Lucius Cocceius Salvius, a spuri filius, who dedicates an inscription for his mother, Vidia Musa, together with Lucius Cocceius Primigenius, his mother’s husband.50 Seeing that both men bear the same praenomen and gentilicium, it is very likely that they were actually father and son, but this is not specified explicitly in the text.51 In a similar case a certain Vescinia Eleutheris (notice the Greek cognomen referring to freedom!) erected an epitaph in honor of Vescinius Rufus, son of Titus, her patron, and Titus Vescinius Rufus, spuri filius.52 Vescinia and her patron are not named as the boy’s parents, yet it seems very likely. Other examples include Rubbius Restitutus, whose father Marcus Rubbius Minius is a freedman,53 and Numisius Quadratus, son of Sextus Numisius, also a freedman.54 It is thus safe to say that in many cases, the father was known and played an active part in his child’s life; due to legal circumstances, however, he could not officially be acknowledged.

Although there are people from different walks of life among the spuri filii, the upper classes are not represented. In a society where status was so dependent on genealogy, illegitimacy was not something the Roman elite would have wanted to flaunt. Twenty-two people have titles referring to political, religious or economic activities appended to their identification. Most of them belonged to the local elite, such as […] Plaetorius from Thermae Himeraeae (Sicily), styled as quaestor aedilis duovir.55 Others were priests of indigenous (Mater Matuta and Bona Dea) or imported deities (Sarapis and Cybele). Nineteen men were in the military; six add an occupation (mainly craftsmen); and two are styled as patrons.

Many spuri filii are mentioned together with freedmen (153 exx.).56 In most cases, no exact relationship can be determined (100 exx.). Where this is possible, the spuri filii are generally defined as family members (children, parents or spouses), or, in rare cases, as former masters of the freedmen. This preponderance of liberti in the inscriptions related to spuri filii is an important indication of the milieu these people moved in.

50 CIL VI.4.2 36550 (TM 596929).
51 Examples such as these are not included in the previous paragraph. In case of such uncertain genealogical relationships, the name of the relative is followed by a question mark in the online data set.
52 CIL X.1 4398 (TM 252630, AD 100–250).
53 CIL VI.4.1 25498 (TM 274430, AD 1–99).
54 Supplementa Italica 9 [pp. 11–209] 118 (TM 283028, 99–1 BC).
55 Année épigraphique 1976 265 (TM 175712, AD 50–150).
56 In the data set, people attested together with freedmen are marked with the term liberti in the ‘Context’ column. The nature of the relationship is defined where possible. If it is uncertain (38 exx.), this is indicated by a question mark. If it is unknown (62 exx.), it is stated as such.

The NER parsing of the personal names in the Latin inscriptions will result in an estimated 500,000 attestations, which will double the number currently available in TM People. This collection will open up many possibilities, not only for onomastic and prosopographic research, but also for studies on cultural identity, religion, and historical linguistics, for example. With such a large dataset, it is crucial to work together with partners to optimize the available information. Trismegistos will therefore continue to work together with initiatives such as EAGLE57 and Standards for Networking Ancient Prosopographies: Data and Relations in Greco-Roman Names [SNAP:DRGN]58 to provide links between existing projects. The team is, moreover, experimenting with the implementation of online tools such as customizable maps and network visualizations so that the entire scholarly community can eventually benefit from the data in as many ways as possible.
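The headline counts reported above for the spuri filii can be cross-checked with a few lines of arithmetic. What follows is only a minimal sketch in Python, not part of the Trismegistos tooling: the variable names and the helper `share` are illustrative assumptions, and the numbers are simply those quoted in the text and in n. 56.

```python
# Counts as reported in the text; all names here are illustrative assumptions.
men, women = 367, 245
total = men + women  # individuals styled spuri filius/filia extracted with NER


def share(part: int, whole: int) -> int:
    """Return the percentage of `whole` made up by `part`, rounded to an integer."""
    return round(100 * part / whole)


men_pct = share(men, total)
women_pct = share(women, total)

# People attested together with freedmen (n. 56): the relationship is either
# uncertain (question mark in the data set) or unknown in most cases.
with_freedmen = 153
uncertain, unknown = 38, 62
undetermined = uncertain + unknown
defined = with_freedmen - undetermined

print(f"total {total}: men {men_pct}% vs women {women_pct}%")
print(f"with freedmen {with_freedmen}: undetermined {undetermined}, defined {defined}")
```

Rounded to whole percentages, the gender split matches the 60% vs 40% given in n. 44, and the uncertain and unknown cases of n. 56 add up to the 100 attestations for which no exact relationship can be determined.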

Yanne Broux
KU Leuven
Blijde Inkomststraat 21 bus 3307
3000 Leuven
Belgium

57 Cit. n. 14.
58 http://snapdrgn.net.

Ancient profiles exploited
First results of Named Entity Recognition applied to Latin inscriptions

Abstract

This paper focuses on the extraction of all personal names and toponyms from Latin inscriptions with the help of Named Entity Recognition [NER] by Trismegistos (www.trismegistos.org). It starts with a brief history, discussing the method and how it was applied to the Greek documentary papyri. The second section explains how the procedure was adapted to enhance performance and, most importantly, to accommodate Roman nomenclature. Finally, a short case study of the name Spurius and the designation spuri filius presents some preliminary results and illustrates the possibilities this new corpus of names opens up for research.

Keywords: Latin inscriptions, Digital Humanities, Named Entity Recognition, onomastics, Spurius

Wykorzystanie starożytnych profili
Pierwsze wyniki zastosowania Named Entity Recognition do łacińskich inskrypcji

Abstrakt

Artykuł dotyczy pozyskania wszystkich imion własnych z łacińskich inskrypcji za pomocą Named Entity Recognition (NER) przez Trismegistosa (www.trismegistos.org). Początek artykułu stanowi wprowadzenie metodologiczne i omówienie, jak zaprezentowaną metodę stosowano do greckich dokumentów papirusowych. Druga część tekstu wyjaśnia, w jaki sposób metoda została ulepszona, a przede wszystkim, dostosowana do rzymskiej nomenklatury. Na koniec za pomocą krótkiego studium przypadku dotyczącego imienia Spurius oraz wyrażenia spuri filius zaprezentowano wstępne wyniki oraz możliwości, jakie ten nowy zbiór imion daje uczonym.

Słowa kluczowe: inskrypcje łacińskie, cyfrowa humanistyka, Named Entity Recognition, onomastyka, Spurius