The English Italian Translational Corpus: a resource for learning about translation
Federico Zanettin (Bologna, Italy)

This paper describes the first phase of the CEXI project at the SSLMIT, University of Bologna in Forlì, which involves selecting the texts to be included in the corpus and deciding how these texts are to be processed (i.e. the transition from “natural language production” to “electronic corpus”). The aim of the project is to create a resource which both students and researchers can use to learn about translation and translating. The English Italian Translational Corpus is planned as a four-million-word corpus subdivided into four components of one million words each. It can be described as a bi-directional parallel corpus made up of samples of 10,000 to 15,000 words from original Italian texts and their translations into English and vice versa, all published between 1945 and 2000. Each of the four components consists of equal proportions of fictional and non-fictional texts; to ensure some variability within these two broad domains, it was decided to include about forty samples for each subcomponent. Once this minimal requirement is met, we hope to expand the corpus in various directions, for instance by including full texts as well as samples, and different translations of the same texts. The aim is to increase diversity and completeness at the same time.

In principle, a reciprocal (Teubert 1996) or bi-directional parallel (Aston 1999) corpus permits comparisons of original texts and their translations in both directions, as well as comparisons of originals and translations in the same language, along the lines suggested by similar projects, notably the English Norwegian Parallel Corpus (Johansson 1998). These types of analysis can cast light on interlingual similarities and differences and on the translation process, both in specific directions and as a more general phenomenon. In practice, the creation of a corpus involves a series of compromises between what is ideally desirable and what is feasible given practical and theoretical limitations.
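The sampling design described above can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of the project itself: the constant and function names are hypothetical, and it assumes that a “subcomponent” is the fiction or non-fiction half of one of the four components.

```python
# Design figures taken from the abstract; all names are illustrative only.
TARGET_WORDS = 4_000_000          # planned size of the whole corpus
COMPONENTS = (                    # the four one-million-word components
    "italian_originals",
    "english_translations",
    "english_originals",
    "italian_translations",
)
DOMAINS = ("fiction", "non_fiction")   # equal proportions in each component
SAMPLES_PER_SUBCOMPONENT = 40          # "about forty" in the abstract
SAMPLE_WORDS_MIN = 10_000
SAMPLE_WORDS_MAX = 15_000


def subcomponent_budget() -> int:
    """Words available to one domain within one component."""
    per_component = TARGET_WORDS // len(COMPONENTS)
    return per_component // len(DOMAINS)


def sample_word_range() -> tuple[int, int]:
    """Fewest and most words that forty samples of the stated size can yield."""
    return (
        SAMPLES_PER_SUBCOMPONENT * SAMPLE_WORDS_MIN,
        SAMPLES_PER_SUBCOMPONENT * SAMPLE_WORDS_MAX,
    )


def required_mean_sample_size() -> int:
    """Average sample length needed to fill a subcomponent exactly."""
    return subcomponent_budget() // SAMPLES_PER_SUBCOMPONENT


def total_samples() -> int:
    """Number of samples in the whole corpus under these assumptions."""
    return len(COMPONENTS) * len(DOMAINS) * SAMPLES_PER_SUBCOMPONENT


if __name__ == "__main__":
    low, high = sample_word_range()
    print("words per subcomponent:", subcomponent_budget())
    print("range covered by 40 samples:", low, "to", high)
    print("mean sample size needed:", required_mean_sample_size())
    print("total samples:", total_samples())
```

Under these assumptions each subcomponent has a budget of 500,000 words, which forty samples of 10,000 to 15,000 words can meet with an average sample length of 12,500 words.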
Any parallel corpus is by necessity “translation driven,” given that only texts which have been translated can be included in it. A review of bibliographical sources shows that full reciprocity is not possible, given the differences in text production policies in different countries. For instance, while about one in four books published in Italy is a translation, and half of those translations are from English (Vigini 1999), only about one in fifty books published in the UK and the US is a translation (Venuti 1995), and only a small percentage of those are from Italian. Furthermore, the difference lies not only in how much is translated, but also in what is translated. For fiction, for instance, it seems that translations from Italian into English are mainly of high-status authors (e.g. Calvino and Eco), while most of what is translated from English into Italian is popular fiction (science fiction, romance and detective stories). The project aims to construct a corpus representing the intersection of what is translated in the two languages (and whose components can therefore be treated as comparable) rather than what is translated in each single direction. We hope subsequently to flank this “core” corpus with additional unidirectional parallel collections which will better reflect the specific characteristics of these two very different translation populations. In this way, we hope to provide a collection of corpus tools for the study and the practice of translation in both directions (cf. Zanettin 1998).

Bibliography

Aston, Guy. 1999. “Corpus Use and Learning to Translate.” In Bassnett, Susan/Bollettieri Bosinelli, Rosa Maria/Ulrych, Margherita. Translation Studies Revisited, Textus XII:2. Genova: Tilgher. 289-314.
Johansson, Stig. 1998. “On the Role of Corpora in Cross-Linguistic Research.” In Johansson, Stig/Oksefjell, S. (eds.) Corpora and Cross-Linguistic Research. Amsterdam: Rodopi. 3-24.
Teubert, Wolfgang. 1996. “Comparable or Parallel Corpora?” International Journal of Lexicography, 9:3. 238-264.
Venuti, Laurence. 1995. The Translator’s Invisibility. London/New York: Routledge.
Vigini, Giuliano. 1999. Rapporto sull'editoria italiana. Milano: Editrice Bibliografica.
Zanettin, Federico. 1998. “Bilingual Comparable Corpora and the Training of Translators.” META 43:4. 616-630.