Ref. Ares(2015)4205268 - 09/10/2015

LT_OBSERVATORY – OBSERVATORY FOR LR AND MT IN EUROPE

Acronym: LT_OBSERVATORY

COORDINATION AND SUPPORT ACTION INFORMATION AND COMMUNICATION TECHNOLOGIES

D3.1 List of National and Regional Strategies

GRANT AGREEMENT: 644583
DELIVERABLE NUMBER: D3.1
DELIVERABLE TITLE: List of national and regional strategies
DUE DATE OF DELIVERABLE: 31/08/2015
ACTUAL SUBMISSION DATE: 09/10/2015
START DATE OF THE PROJECT: 01/01/2015
DURATION: 24 months
ORGANIZATION NAME RESPONSIBLE FOR THIS DELIVERABLE: EMF

PROJECT CO-FUNDED BY WITHIN THE SEVENTH FRAMEWORK PROGRAMME

DISSEMINATION LEVEL
PU Public ☒
PP Restricted to other programme participants (including the Commission Services) ☐
RE Restricted to a group specified by the consortium (including the Commission Services) ☐
CO Confidential, only for members of the consortium (including the Commission Services) ☐

TYPE
Document, report ☒
DEM Demonstrator, pilot, prototype ☐
DEC Websites, patent filings, prototype ☐
OTHER ☐

TABLE OF CONTENTS

DOCUMENT INFO
1. FOREWORD
2. EXECUTIVE SUMMARY
3. LANGUAGE STRATEGIES AT EU LEVEL – AN INTRODUCTION
    3.1 HISTORICAL BACKGROUND AND LANGUAGES
    3.2 FRAGMENTED LANGUAGE POLICIES
    3.3 CHRONOLOGICAL MILESTONES OF EU LANGUAGES POLICIES
4. AUSTRIA
    4.1 BACKGROUND
    4.2 LANGUAGES
    4.3 RELEVANT ORGANISATIONS
    4.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
5. BELGIUM
    5.1 BACKGROUND
    5.2 LANGUAGES
    5.3 RELEVANT ORGANISATIONS
    5.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
6. CZECH REPUBLIC
    6.1 BACKGROUND
    6.2 LANGUAGES
    6.3 RELEVANT ORGANISATIONS
    6.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
7. DENMARK
    7.1 BACKGROUND
    7.2 LANGUAGES
    7.3 RELEVANT ORGANISATIONS
    7.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
8. ESTONIA
    8.1 BACKGROUND
    8.2 LANGUAGES
    8.3 RELEVANT ORGANISATIONS
    8.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
9. FRANCE
    9.1 BACKGROUND
    9.2 LANGUAGES
    9.3 RELEVANT ORGANISATIONS
    9.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
10.
    10.1 BACKGROUND
    10.2 LANGUAGES
    10.3 RELEVANT ORGANISATIONS
    10.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
11. LUXEMBOURG
    11.1 BACKGROUND
    11.2 LANGUAGES
    11.3 RELEVANT ORGANISATIONS
    11.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
12. HUNGARY
    12.1 BACKGROUND
    12.2 LANGUAGES
    12.3 RELEVANT ORGANISATIONS
    12.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
13. IRELAND
    13.1 BACKGROUND

644583 | DELIVERABLE D3.1 This project is co-funded by the European Union

    13.2 LANGUAGES
    13.3 RELEVANT ORGANISATIONS
    13.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
14. ITALY
    14.1 BACKGROUND
    14.2 LANGUAGES
    14.3 RELEVANT ORGANISATIONS
    14.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
15. LATVIA
    15.1 BACKGROUND
    15.2 LANGUAGES
    15.3 RELEVANT ORGANISATIONS
    15.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
16. LITHUANIA
    16.1 BACKGROUND
    16.2 LANGUAGES
    16.3 RELEVANT ORGANISATIONS
    16.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
17.
    17.1 BACKGROUND
    17.2 LANGUAGES
    17.3 RELEVANT ORGANISATIONS
    17.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
18. POLAND
    18.1 BACKGROUND
    18.2 LANGUAGES
    18.3 RELEVANT ORGANISATIONS
    18.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
19.
    19.1 BACKGROUND
    19.2 LANGUAGES
    19.3 RELEVANT ORGANISATIONS
    19.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
20. ROMANIA
    20.1 BACKGROUND
    20.2 LANGUAGES
    20.3 RELEVANT ORGANISATIONS
    20.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
21. SLOVENIA
    21.1 BACKGROUND
    21.2 LANGUAGES
    21.3 RELEVANT ORGANISATIONS
    21.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
22.
    22.1 BACKGROUND
    22.2 LANGUAGES
    22.3 RELEVANT ORGANISATIONS
    22.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
23.
    23.1 BACKGROUND
    23.2 LANGUAGES
    23.3 RELEVANT ORGANISATIONS
    23.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
24. UK – UNITED KINGDOM
    24.1 BACKGROUND
    24.2 RELEVANT ORGANISATIONS
    24.3 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
25. ANNEX

    25.1 BY COUNTRY
    25.2 PRACTICAL INFORMATION
26. COPYRIGHT POLICY

DOCUMENT INFO

AUTHORS
Name, Company, E-mail
All partners
Margaretha Mazura, EMF, [email protected]

REVIEWERS
Name, Company, E-mail
Margaretha Mazura, [email protected]
Luz Esparza, ZABALA, [email protected]
Blanca Rodriguez, ZABALA, [email protected]

DOCUMENT CONTROL
Document version, Date, Change
D3.1.1, 15/09/2015, First internal draft
D3.1.2, 08/10/2015, Second internal draft
D3.1 v1, 09/10/2015, Final version by the consortium to be submitted to the EC

1. FOREWORD

One of the objectives of the LT Observatory project is to investigate the national and regional support given to languages, language technology and language strategies in the EU Member States. Languages are a sensitive issue as they determine a people’s culture, tradition and behaviour. At the same time, languages are of considerable economic relevance.

Work package 3 of the LT Observatory project is dedicated to “National support for LT/LR/MT”. Task 3.1 aims to identify EU national and regional strategies and initiatives with regard to languages. Not all, and indeed not many, EU Member States have developed a language strategy. It is therefore interesting to identify those countries that have already defined a language strategy and to raise awareness of it. This is of particular importance where such language strategies include language technologies (as will be the case in Spain in the near future). Such strategies can lead to best-practice methodologies for supporting a country’s own language, and may serve as an example to other Member States. Results from this document will be published online (http://www.lt-innovate.eu/lt-observe/public-policy-observatory/national-language-policies). Furthermore, results will be included in the Strategic Research and Innovation Agenda (SRIA¹) and the MT EcoGuide, foreseen at the end of the project.

D3.1 is described as follows: “List of national and regional strategies for languages, language resources and language technologies”. Therefore, the goal is broader than mere technologies, in order to see the background that led to a strategy, and to investigate future plans and opportunities.

Due to the focus of the LT Observatory project, minority languages that do not have the status of an official or co-official language will be mentioned but not further elaborated on. The same applies to immigrant languages.

1 Together with the CRACKER project

2. EXECUTIVE SUMMARY

“The language of Europe is translation” Umberto Eco, 1993

National and regional strategies are mainly apparent where a country or region wants to preserve its language and fosters it through language learning or preservation strategies. The close relation of language with culture and education is also reflected in the strategy at EU level, which has always been governed by DG Education and Culture initiatives, often in parallel with others, e.g. DG Translation or DG Connect (then InfSo).

It is far rarer to find strategies that involve language technologies at national level. A first attempt was made by France in the early 21st century with its Technolangue programme (2003-2006), which has recently been taken up again for a potential “Technolangue II”. Countries with “exotic” official languages, like Ireland, are keener to engage in technologies that can help their language to gain momentum. This can also be seen in the Baltic countries, where language strategies promote the official national languages, often in contrast to formerly used languages like Russian. Some Member States promote the learning of several languages in order to enhance one’s own plurilingual portfolio, in line with EU educational policies. The socio-economic element in assessing languages and language technologies is missing in all Member States’ policies. And hardly any Member State has a real strategy that puts forward goals and paths on how to achieve them.

3. LANGUAGE STRATEGIES AT EU LEVEL – AN INTRODUCTION

3.1 HISTORICAL BACKGROUND AND LANGUAGES

“There is no more emotional topic in the EU than the language issue” Wilhelm Schönfelder, 2005

Long-term diplomat and then representative of Siemens in Brussels until 2010, Wilhelm Schönfelder² explains with this sentence why there are continuous ups and downs in Europe’s language policy and strategy. Languages define personal identities, but are also part of a shared inheritance. They can serve as a bridge to other people and open access to other countries and cultures, promoting mutual understanding. A successful multilingualism policy can strengthen the life chances of citizens: it may increase their employability, facilitate access to services and rights, and contribute to solidarity through enhanced intercultural dialogue and social cohesion³.

The very first Regulation issued by the then new EEC (Regulation No. 1, OJ 17, 6.10.1958, p. 385) determined the languages to be used by the European Economic Community and declared the equality of all official languages. That was then an easy task, with 6 Member States⁴ and 4 official languages⁵.

Currently, the EU has 500 million citizens, 28 Member States, 3 alphabets and 24 official languages, some of them with a worldwide coverage. Some 60 other languages are also part of the EU's heritage and are spoken in specific regions or by specific groups. In addition, immigrants have brought a wide range of languages with them; it is estimated that at least 175 nationalities are now present within the EU’s borders.

Linguistic diversity is enshrined in Article 22 of the European Charter of Fundamental Rights ("The Union respects cultural, religious and linguistic diversity"), and in Article 3 of the Treaty on European Union ("It shall respect its rich cultural and linguistic diversity, and shall ensure that Europe’s cultural heritage is safeguarded and enhanced.")⁶.

2 ‘Es gibt in der EU kein emotionaleres Thema als Sprachen.’ Wilhelm Schönfelder, cited in Süddeutsche Zeitung, 1 April 2005, quoted in Linguistic diversity and European democracy, ed. Anne Lise Kjær and Silvia Adamo, Farnham: Ashgate, 2010, pp. 57-74
3 Quoted from: http://ec.europa.eu/languages/policy/linguistic-diversity/index_en.htm
4 France, Germany, Italy and the Benelux countries
5 French, German, Italian and Dutch.
6 Source: http://ec.europa.eu/languages/policy/linguistic-diversity/official-languages-eu_en.htm

3.2 FRAGMENTED LANGUAGE POLICIES

Languages are traditionally associated with culture and education: language as an expression of culture, and language learning as an essential part of education. Therefore, many language policies at EU level were initiated by DG Education & Culture (EAC). However, the European Commission has supported Human Language Technologies⁷ for more than 40 years. A considerable effort was made between 1980 and 1990, which resulted in some pioneering Machine Translation and Translation Memory technologies. Financial support for language technologies reached a peak during the 7th Framework Programme (DG CNECT, then DG InfSo). The current EU ambition to create a Digital Single Market revives the support for language technologies, e.g. for cross-border transactions: more and more commercial transactions are being done online, and more of the consumers using the Web do not speak English than do. Recent e-commerce statistics indicate that two out of three EU customers buy only in their own language. This suggests that language is a significant barrier to a truly Europe-wide Digital Single Market. Language barriers not only affect e-commerce activities, but also have repercussions on access to content and online services. This applies particularly to eGovernment services, which will be taken care of by the Connecting Europe Facility (CEF)⁸. The multilingual element of this initiative is spearheaded by DG Translation’s MT@EC tool, which is open to all public institutions of all Member States and has at its disposal a corpus covering all official EU languages. CEF and Horizon 2020 work hand-in-hand to fund relevant projects that support CEF’s multilingualism.

3.3 CHRONOLOGICAL MILESTONES OF EU LANGUAGES POLICIES

3.3.1 2002 TO 2010

With the enlargement of the EU came a renewed enthusiasm for traditional topics such as multilingualism. The first decade of the 21st century is a perfect example of this.

2001: European Year of Languages
2001 was declared the European Year of Languages by the European Union, the Council of Europe and UNESCO. The European Day of Languages was celebrated for the first time on 26 September.

2003: The Language Action Plan (2004-2006): “Promoting Language Learning and Linguistic Diversity”
The purpose of this action plan is to promote language learning and linguistic diversity. It defines specific objectives and a set of actions to be implemented between 2004 and 2006 (COM(2003) 449).

2005: A New Framework Strategy for Multilingualism
In November 2005, the Commission published a Communication entitled “A New Framework Strategy for Multilingualism”, its first-ever Communication on this subject.

2006: ELAN Study: Effects on the European Economy of Shortages of Foreign Language Skills in Enterprise
Study carried out by CILT for the EC, December 2006.

7 HLT include natural language processing, speech technology, machine translation, information extraction, data analytics etc.
8 See, for example, http://www.rigasummit2015.eu/sites/rigasummit2015.eu/files/cef_29_04_2015_aleksandra_wesolowska_cef_automated_translation_dsi_setting_the_scene.pdf

2007: Establishment of the Multilingualism portfolio
From 1 January 2007 until 9 February 2010, a dedicated multilingualism portfolio existed within the Commission, held by Leonard Orban (Romania). Its shelf-life was short: with the Barroso II Commission, multilingualism fell back to the Commissioner for Culture, Education, Multilingualism and Youth.

November 2007: Establishment of the Business Forum for Multilingualism
Commissioner Orban established this Forum under the chair of Viscount Etienne Davignon. Its Final Report was presented in July 2008.

Council conclusions of 22 May 2008 on multilingualism
The Conclusions build on discussions held at the Education Council in November 2007 and at the Ministerial Conference on Multilingualism held on 15 February 2008, and deal mainly with the equality of languages and the importance of language learning.

2008: Multilingualism: an asset for Europe and a shared commitment {SEC(2008) 2443} {SEC(2008) 2444}
Commission Communication that dealt with different aspects of multilingualism, including competitiveness and technology.

2009: Establishment of the Business Platform for Multilingualism⁹
An initiative of DG EAC, September 2009. This initiative, like the Business Forum for Multilingualism above, showed that DG EAC’s vision at the time went beyond purely educational purposes towards lifelong learning, with the aim of fostering thriving entrepreneurship in Europe. This endeavour was taken further by the (Lifelong Learning) CELAN project¹⁰ (see below).

2009: Establishment of the Civil Society Platform for the promotion of Multilingualism
An initiative of DG EAC, October 2009. This initiative established the poliglotti4.eu web platform.

3.3.2 2010 TO 2015

This period was characterised by relatively high budgets for language technologies in FP7, which led to strategically crucial developments in Europe’s language technology landscape.

2010: T4ME project
The FP7 project T4ME is at the origin of the META-NET network of LT researchers, which brings together leading scientists and researchers as well as other stakeholders in the language technologies sector.

9 The author of this deliverable held the position of Communication Manager of the platform.
10 LTO partner EMF was also partner of CELAN

2011: LT Compass project¹¹
The FP7 project LT Compass is at the origin of LT-Innovate, the association of Europe’s LT Industry.

2011: Language Guide for European Business

2011: Inventory of Community actions in the field of multilingualism
The staff working document complements and underpins the Communication on Multilingualism by mapping the actions that the different services of the Commission have already undertaken or are carrying out in this field. It already postulates that “information and communication technologies (ICT) must become more language-aware and support content creation and distribution in multiple languages while providing effective ways of bridging the language barrier, for both inter-personal and business purposes.”

2011: CELAN project
From 2009 to 2013 the Commission coordinated the business platform that gave input to the Network for the Promotion of Language Strategies for Competitiveness and Employability (CELAN). The aim of CELAN (2011-2013) was to identify the language needs of EU firms and employees and to provide tools to meet them, including the opportunities that language technologies can offer. The project was funded by the Lifelong Learning programme of DG EAC in the wake of the DG EAC initiative “Business Platform for Multilingualism”.

2012: 1st LIND Web Forum
The LIND website of DG Translation compiles facts and figures about the EU language industry. LIND-Web is a spin-off of the study Size of the EU language industry¹².

2013: End of FP7
In 2012, Calls 10 and 11 as well as the SME initiative had 73 MEUR at their disposal for language technologies (including data analytics, nowadays a key element of Big Data). In Horizon 2020 and the Digital Agenda, language technologies were merged with the Data Value Chain. FP7 page on Language Technologies on CORDIS (for “historical” information)

2014: Europe’s Digital Agenda, Horizon 2020 and CEF
These two funding programmes support language technologies in a broader vision, i.e. the Digital Single Market and the overarching vision of the Digital Agenda. http://ec.europa.eu/digital-agenda/en/science-and-technology/language-technologies

11 LTO partner EMF was also partner in LT Compass
12 The study was carried out by LTC, partner of LT Compass and founding member of LT Innovate.

In the Work Programme 2014/15, Language Technologies featured as “Cracking the Language Barrier”. A portfolio of past and current supported projects can be found at: http://ec.europa.eu/digital-agenda/en/node/6283#EU Investments (DG CNECT LT project portfolio)

DG Translation activities can be found at: http://ec.europa.eu/dgs/translation/programmes/languageindustry/platform/index_en.htm

2015: Council document
On 21 May 2015 the Council of the EU adopted a Communication on “Digital Single Market policy - a) Draft Council conclusions on the digital transformation of European industry”. Under deliberation 7 (p. 7), it states: “[…] NOTES that digital tools can play an important role in exploiting the full potential of multilingualism for doing business in the Single Market, particularly for SMEs with relatively limited capacity in the areas of administration, finance and management; INVITES the Commission to encourage the development of interoperable digital tools, for example in the area of machine translation;”

2015: Horizon 2020 ICT Work Programme 2016/17
The most recent draft version of the ICT Work Programme 2016/17 of Horizon 2020 mentions the following in its Introduction (p. 6):

“Application of Language Technologies is supported under topics ICT-14, 15, 16 (Big data PPP). Proposers addressing other topics are encouraged to make use of Language Technologies (e.g. machine translation, speech recognition, dialogue management, text analysis, text generation), if the proposal involves analysis or interpretation of information expressed in human language, or if the proposal addresses human-to-human or human-to-machine interaction or communication.”

While this is not yet a funding portfolio as under FP7, it may be a sign that decision makers are slowly becoming aware that Language Technologies are of strategic importance for Europe. But without a specific portfolio as in past programmes, progress towards broader goals, in particular infrastructures, will be limited.

4. AUSTRIA

4.1 BACKGROUND

4.1.1 COUNTRY CHARACTERISTICS

The Republic of Austria is a federal state with about 8.6 million inhabitants. It has 9 provinces; Vienna is the federal capital. Austria has been a member of the European Union since 1995 and is a member of Schengen and of the Euro zone.

4.2 LANGUAGES

4.2.1 OFFICIAL LANGUAGE

German (Deutsch) is the official language (Amtssprache). Austrian German is a standard variant of German. There are many German dialects spoken in all provinces, with Alemannic dialects in Vorarlberg and Bavarian (south-east) dialects in all other provinces.

4.2.2 CO-OFFICIAL LANGUAGES

The regional co-official languages of official minorities (regionale Amtssprachen, Minderheitensprachen) in Austria are Slovenian/Slovene in Carinthia and Styria, Burgenland-Croatian in Burgenland, Romanes/Romany in Burgenland, Hungarian in Burgenland and Vienna, and Czech in Vienna, as well as Austrian Sign Language.

As a result of the ethnic diversity brought about by immigration, there are a number of major additional languages such as Turkish, Serbian, Bosnian, Slovak, Chinese, Persian, Kurdish, Polish, Albanian, Romanian, Italian, etc.

FIGURE 1 LANGUAGES IN AUSTRIA (SOURCE: WIKIMEDIA)

4.3 RELEVANT ORGANISATIONS

For primary and secondary education, the Federal Ministry of Education and Women regulates the use of language(s) in schools. The Federal Ministry of Science, Research and Economy supports research and development in language technologies and language resources in the context of European and international initiatives. The Federal Chancellery is responsible for minority languages and their rights, and for information society, open government and related issues that also include language resources. The Austrian Standards Institute also deals with standardized terminology in the context of standards development at national, European and international levels. The European Centre for Modern Languages of the Council of Europe (ECML) in Graz deals with migration languages, language learning, language competence levels, evaluation and assessment, etc. Relevant international NGOs located in Vienna are INFOTERM, the International Information Centre for Terminology (founded by UNESCO in 1971), and TermNet, the International Network for Terminology (since 1988). The national defence academy of the Austrian Army includes a language institute (Sprachinstitut des Bundesheeres) covering language services (translation, interpreting), language teaching and terminology management.

4.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

4.4.1 PAST LANGUAGE STRATEGIES AND POLICIES (NOT APPLICABLE)

4.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

Austrian language policy in the area of education is the responsibility of the Federal Ministry of Education and Women and is pursued in a European context. Austria pursues its language policy goals in close cooperation with European institutions and is actively involved in all relevant language-related programmes of the Council of Europe and of the European Union (e.g. the common European reference framework for languages, the European language portfolio, languages in education, European day of languages, new media in language education, language integration of migrants, etc.). A particular focus is put on the promotion of multilingualism and linguistic diversity. In this context the LEPP process (language education policy profiles) is of particular importance.

The European Centre for Modern Languages (ECML) of the Council of Europe, located in Graz, is of particular importance, not only at the European level but also for national activities, policies and strategies in modern language learning in the educational context. The Austrian language competence centre (ÖSZ) offers direct support to schools in terms of language courses, didactics, new media, learning materials, multilingualism, content and language integrated learning (CLIL), quality label awards, teacher training, support, immigrant language support, etc., at all levels of education. The Austrian language committee ÖSKO (Österreichisches Sprachenkomitee) is a participative platform for promoting multilingualism and linguistic diversity. This committee is jointly operated by the Federal Ministry of Education and Women, the Federal Ministry of Science, Research and Economy and the Austrian language competence centre.

In this context the promotion of the German standard in Austria (Österreichisches Deutsch, Deutsch in Österreich) is important in practical implementation (e.g. in the Austrian language diploma ÖSD Österreichisches Sprachdiplom Deutsch) and in academic research.

The Federal Ministry of Transport, Innovation and Technology (BMVIT) pursues an “Austrian ICT of the Future” programme in the areas of systems of systems, trusted and intelligent systems and interoperability. Research and development in language technology and natural language processing is of particular importance and relevance to data analytics, big data, data integration, semantic processing and semantic systems, knowledge work, ontology engineering, etc., and is thus of key importance to the national technology innovation policy. In the past, specific national funding programmes for “semantic systems” were successfully pursued by the BMVIT, boosting collaborative R&D between companies and research organisations in the fields of language technologies, ontology engineering, and other language-related topics in ICT R&D.

The Federal Ministry of Science, Research and Economy is an active member of the European Strategy Forum on Research Infrastructures (ESFRI) and actively supports the Social Sciences and Humanities (SSH) research infrastructure initiatives at national level, among them CLARIN, the research infrastructure for language resources and language technologies. Austria was a founding member of CLARIN in 2007 and of CLARIN ERIC in 2012. The ministry’s policy focuses on the sustainability of linguistic research infrastructures in Austrian research institutions, in particular in Austrian universities and the Academy of Sciences. The policy of the ministry has been to empower a national consortium in the context of CLARIN, CLARIN-AT, to build up a national language resource and language technology research infrastructure, coordinated in particular by the University of Vienna (Centre for Translation Studies, UNIVIE-CTS) and the Institute for Corpus Linguistics and Text Technology (ICLTT). More recently, this has been re-organised and widened as a federal infrastructure under the Austrian Centre for Digital Humanities (ACDH), where the details of the national strategy are developed and implemented under the auspices of the Ministry and in close coordination with the CLARIN ERIC Board of Directors. UNIVIE-CTS and ÖFAI (Austrian Research Centre for Artificial Intelligence) have also actively participated as national delegations in the European META project and have been actively contributing to Austria’s national R&D policies and strategies and their implementation in research projects.

What is needed at national level is more active cooperation among all federal ministries including the federal chancellery in developing a stable inter-ministerial federal national policy on language related issues, since language education is increasingly linked to language technologies, which in turn are linked to research infrastructures, and multilingualism is related to social affairs, migration and integration management, etc.

5. BELGIUM

5.1 BACKGROUND

5.1.1 COUNTRY CHARACTERISTICS

The Kingdom of Belgium is one of the founding members of the European Union and forms a customs union together with the Netherlands and Luxembourg (the BENELUX union). Based on the latest statistics, Belgium has a population of more than 10 million. Its capital is Brussels, often called the “heart of Europe”, where many of the European institutions are located. Belgium is a federal state composed of three regions: Wallonia, Flanders and Brussels. The regions have legislative power for “territorial” issues, whereas “people” matters (including culture and education as well as languages) fall within the competence of the three communities that correspond to the languages: the French Community, the Flemish Community and the German-speaking Community.

FIGURE 2 MAPS OF BELGIAN LANGUAGES: LEFT, A SMALL ENCLAVE OF FRENCH IN FLANDERS; RIGHT, A SMALL ENCLAVE OF FLEMISH IN WALLONIA. THE STRIPED DOT IS BILINGUAL BRUSSELS.

5.2 LANGUAGES

5.2.1 OFFICIAL LANGUAGES

There are three official languages: French, Dutch and German. Brussels is officially bilingual, but French is the most widely spoken language there. Roughly 59% of Belgian citizens belong to the Flemish Community, 40% to the French Community and 1% to the German-speaking Community. Citizens are not required to indicate their mother tongue, only the language in which they prefer to communicate with the authorities. Belgium constitutionally guarantees “freedom of language” (for the private sphere): Article 30 specifies that "the use of languages spoken in Belgium is optional; only the law can rule on this matter, and only for acts of the public authorities and for legal matters."

5.2.2 CO-OFFICIAL LANGUAGES

There are no co-official languages in Belgium. Besides the three official languages, inhabitants speak various regional forms of French (Walloon dialects) and Dutch, among others. The Brabantian-French “Bruxellois”, the language of Brussels’ inhabitants, is almost totally extinct.

English is widely spoken throughout Belgium as a second or third language by native Belgians, and is sometimes used as a lingua franca in Brussels, in particular at international conferences and at the European institutions.

5.3 RELEVANT ORGANISATIONS

• Human Language Technology Central: a portal where many language resources for the Dutch language are available. 13
• KU Leuven: the Katholieke Universiteit Leuven offers courses in speech and language technologies.
• House of Dutch (Huis van het Nederlands): a non-profit organisation financed by the Flemish government. Its aim is to find the right courses for adults who want to learn Dutch. It was established to promote the Dutch language among foreigners and among students who come to study in Brussels 14.
• The Nederlandse Taalunie: the intergovernmental policy organisation for the Dutch language 15, of which Flanders is part.
• Flemish Government, Department of Economy, Science & Innovation: in charge of ICT, including language technologies.
• De Rand: a centre for the Dutch language sponsored by the Flemish government. It offers Dutch lessons 16.
• Alliance française de Bruxelles: a non-profit association that forms part of a worldwide network of 829 locally governed associations in 137 countries. Its aim is to provide instruction and to promote francophone culture 17.
• Fédération Wallonie Bruxelles 18: in charge of all French-language issues in the Brussels region, including research and new technologies.

13 http://tst-centrale.org/
14 http://deredactie.be/cm/vrtnieuws.english/flanders%2Btoday/1.959986
15 http://taalunieversum.org/
16 http://welkom.derand.be/en
17 http://www.alliancefr.be/en/alliance_fr.html
18 http://www.federation-wallonie-bruxelles.be/


• Le CERAN (Lingua International): a language centre for improving French language skills 19.
• BXL Academy: a linguistics centre located in Brussels that offers Dutch and French courses 20.
• Institut für Deutsche Sprache: the Institute for the German Language (IDS), founded in the 1960s, is the central institution for research on and documentation of the German language in its current use and recent history 21. The German-speaking Community takes its orientation from this institute.
• Parliament of the German-speaking Community 22: the regional authority for issuing laws and decrees within the competence of the community, in particular culture and education, both of which include languages. The current president of the Parliament is also Vice-President of the Committee of the Regions.
• STEVIN-archief: a six-year Dutch-Flemish research programme (until 2013) for Dutch language and speech technology. Its aim was to contribute to further progress and to stimulate further research in HLTD in Flanders and the Netherlands. The programme was financed by the Flemish and Dutch governments (the Ministry of Education, Culture and Science and the Netherlands Organisation for Scientific Research) 23.

5.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

5.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

1830-1898: In 1830 the Belgian Constitution was written entirely in French, and the Flemish Movement demanded the recognition of Dutch as an official language with the same status. During this period, successive language laws were passed (1873, 1878, 1893), and finally in 1898 the Law of Linguistic Equality placed Dutch on an equal footing with French as an official language. 24 However, while Flemish-speaking people usually spoke and understood French, French speakers did not (want to) speak Flemish.

A 1962 law determined which municipalities belonged to which language area. In the same year another law came into force concerning the linguistic regime of teaching; its Article 4 clearly states that in the Dutch-speaking regions the language of teaching has to be Dutch. The same law provides that in the French-speaking regions the language of teaching should be French. However, adjacent communities apply what is called “language facilitation”: citizens in these areas can choose which language to use in communication with public authorities. An important treaty on the Nederlandse Taalunie was signed in September 1980 between the Kingdom of Belgium and the Kingdom of the Netherlands. Its aim is for the two countries to follow a common policy on the Dutch language.

19 http://www.ceran.com/fr
20 http://www.bxlacademy.be/EN/Home?gclid=CO39tc7mssgCFUX4wgodwJMF1w
21 http://www1.ids-mannheim.de/
22 www.pdg.be
23 http://tst-centrale.org/stevin/english/
24 http://www.efnil.org/documents/language-legislation-version/belgium/belgium


5.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

Flanders in particular supports the use of language technologies through different funding mechanisms. A roadmap proposal/project by CrossLang is pending. In the near future, the Dutch Language Union will commission a report on publishing support software for Dutch. Overall, the LT market in Belgium is very fragmented (as in Europe in general), and not only because of the country's multilingualism.

6. CZECH REPUBLIC

6.1 BACKGROUND

6.1.1 COUNTRY CHARACTERISTICS

The Czech Republic (CR) has a population of around 10.2 million (2.2% of all EU citizens). The country came into existence in 1993 as a result of the split of the former Czechoslovakia (which had approx. 15 million inhabitants). The European Association Treaty of 1993, which entered into force on February 1, 1995, gave the Czech Republic the status of an associated country, and in January 1996 the state applied for membership of the European Union. The majority of Czech citizens supported EU membership in the referendum held on 13-14 June 2003. The Czech Republic joined the European Union on May 1, 2004.

In 2010, almost 60% of Czechs were Internet users. Most of them say they are online every day. Among young people, the proportion of users is even higher. In January 2011, more than 750 thousand ".cz" domains were registered. These numbers suggest the vast amount of data available on the web.

6.2 LANGUAGES

6.2.1 OFFICIAL LANGUAGES

The official language is Czech, which is used by about 96% of the population. However, there is no special language law. In 2004, a proposal from Communist MPs for an amendment to the Constitution that would implement a national and official language was rejected.

CR citizens that belong to national and ethnic minorities can use their own language according to the Charter of Fundamental Rights and Basic Freedoms. If they have an interpreter, the state will pay the cost. The exceptions are the Code of Criminal Procedure and Code of Civil Procedure that guarantee the right to an interpreter during court proceedings and with law enforcement authorities, but without reimbursement of the cost.

Leaflets and other publications must be published in the Czech language as defined by the Act on Consumer Protection. Based on data from the Czech Statistical Office, as of 31 December 2013 the Czech Republic had 10 512 419 inhabitants. The Czech Housing and Population Census consistently includes a question on ethnicity. The last such survey was conducted in March 2011.


German, Polish, Hungarian, Ukrainian, Romani, Slovak and Croatian are spoken in the Czech Republic, though only the first four are recognized as official minority languages25. The second-largest language by number of speakers (after Czech) is Slovak, followed by Polish, German and Romani.

Article 25 of the Charter (on minorities) provides for education in minority languages and the Minority Act guarantees the right to be educated in the minority language from nursery school level through to secondary schools.

6.2.2 CO-OFFICIAL LANGUAGES

None

6.3 RELEVANT ORGANISATIONS

The Institute of the Czech Language of the Academy of Sciences of the Czech Republic (Ústav pro jazyk český, Akademie věd České republiky) is widely accepted as the regulatory body of the Czech language. Its recommendations on standard Czech (spisovná čeština) are viewed as binding by the educational system, newspapers and others, although there is no legal basis for such recommendations.

The CR was one of the first countries to apply the Common European Framework of Reference for Languages. The Institute of the Czech Language codifies, among other things, the orthoepy and morphology of the language. The public is very sensitive to language changes such as the rather limited spelling reform in 1993. In everyday communication, most people prefer non-literary Czech. The most widespread variety is so-called Common Czech (based on the Central Bohemian dialect). In Moravia and Silesia, the remnants of dialects (Hanak, Lach, Czecho-Moravian) are still used actively in the spoken form.

Most of the government-sourced funding programmes are maintained by the Czech Science Foundation (GAČR) and focused on basic research. In 2009, the Technology Agency of the Czech Republic (Technologická agentura České republiky, TAČR) was established, tasked with applied research. So far there are no LT-related projects funded by TAČR.

There is also an Institute of the Czech National Corpus (http://ucnk.ff.cuni.cz/english/index.php ), Faculty of Arts, Charles University in Prague, tasked with building and maintaining a corpus of Czech language data.

Resources for learning Czech: http://kam.mff.cuni.cz/~uli/czechlinks.html A listing of Czech language resources for translators is also available from the EU.

25 Source: http://www.gencat.cat/llengua/noves/noves/hm04tardor/docs/zwilling.pdf


6.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

6.4.1 CURRENT NATIONAL POLICIES AND STRATEGIES

The Constitution does not include any specific mention of an official or state language, nor does any other Czech law define an official language or a specific language to be used for official communication. The official status of the Czech language is, however, implicit in certain legal regulations. German, Polish, Hungarian and Ukrainian are recognized as official minority languages. Romani, Slovak and Croatian are also spoken in the Czech Republic. The right of ethnic and national minorities to use their language in communication with authorities is primarily based on the Constitution, Article 25/2/b. The Government notes that “in respect of the Romani national minority, one unresolved problem is the highly insufficient number of Romani interpreters.”

The Czech state has been relatively active in spreading Czech abroad (teaching fellowships), but there is no institution specializing in the promotion of Czech comparable to the Goethe Institut for German or Polonicum for Polish. Teaching fellowships are present in only half of the EU Member States.

Due to the fact that the Czech Republic has attracted tens of thousands of foreigners as economic migrants from within the EU and from external migration, the teaching of Czech as a foreign language has become a significant pedagogical challenge, for which resources are needed.

NEEDS

The CR language technology community is “cautiously optimistic” about the current state of language technology support. There is a viable LT research community in the Czech Republic, which has been supported in the past by various national and EU research programmes; a small number of resources and technologies have been produced for Czech. However, these resources and tools are still very limited when compared to the resources and tools for languages with much larger speaker populations.

The Czech language technology industry dedicated to transforming research into products is currently small, fragmented and disorganized. Most large companies have either stopped development or severely cut their efforts. The country’s few specialized SMEs are not robust enough to address the internal and the global market on a sustained basis, even though there are one or two large localisation/translation players based in the country. There is therefore a substantial need for new infrastructure effort and a more coherent research organization to spur greater sharing and cooperation for building the requisite resources at national level.


7. DENMARK

7.1 BACKGROUND

7.1.1 COUNTRY CHARACTERISTICS

Denmark is a constitutional monarchy with a parliamentary system in Northern Europe. The southernmost of the Nordic countries, it is located southwest of Sweden and south of Norway, and bordered to the south by Germany. Denmark forms part of the cultural region called Scandinavia, together with Sweden and Norway. The Kingdom of Denmark is a sovereign state that comprises Denmark and two autonomous constituent countries in the North Atlantic Ocean: the Faroe Islands and Greenland. Denmark proper has an area of 43,094 square kilometres (16,639 sq mi), and a population of 5,678,348 (July 2015). The country consists of a peninsula, Jutland, and an archipelago of 443 named islands, of which around 70 are inhabited. The islands are characterised by flat, arable land and sandy coasts, low elevation and a temperate climate.

The Constitution of Denmark was signed on 5 June 1849, ending the absolute monarchy which had begun in 1660. It establishes a constitutional monarchy—the current monarch is Queen Margrethe II—organised as a parliamentary democracy. The government and national parliament are seated in Copenhagen, the nation's capital, largest city and main commercial centre. Denmark exercises hegemonic influence in the Danish Realm, devolving powers to handle internal affairs. Denmark became a member of the European Union in 1973, maintaining certain opt-outs; it retains its own currency, the krone. It is among the founding members of NATO, the Nordic Council, the OECD, OSCE, and the United Nations; it is also part of the Schengen Area.

7.1.2 LANGUAGES OVERVIEW SITUATION

Danish is the de facto national language of Denmark and the official language of the Kingdom of Denmark. Faroese and Greenlandic are the official regional languages of the Faroe Islands and Greenland respectively. German is a recognised minority language in the area of the former South Jutland County (now part of the Region of Southern Denmark), which was part of the German Empire prior to the Treaty of Versailles. Danish and Faroese belong to the North Germanic (Nordic) branch of the Indo-European languages, along with Icelandic, Norwegian and Swedish. The languages are so closely related that it is possible for Danish, Norwegian and Swedish speakers to understand each other with relatively little effort. Danish is more distantly related to German, which is a West Germanic language. Greenlandic or "Kalaallisut" belongs to the Eskimo–Aleut languages; it is closely related to the Inuit languages in Canada, such as Inuktitut, and entirely unrelated to Danish.

A large majority (86%) of Danes speak English as a second language, generally with a high level of proficiency. German is the second-most spoken foreign language, with 47% reporting a conversational level of proficiency. Denmark had 25,900 native German speakers in 2007 (mostly in the Southern Jutland region).


7.2 LANGUAGES

7.2.1 OFFICIAL

Danish is the official language of Denmark. It is also the native or cultural language of around 50,000 Germano-Danish citizens living in the south of Schleswig, and the Danes who emigrated to America and Australia have preserved their native language to a certain extent. At the international level, Danish has been one of the official languages of the European Union since 1973.

7.2.2 CO-OFFICIAL LANGUAGES

Besides the official language, Danish, several minority languages are spoken throughout the territory. These include German, Faroese and Greenlandic.

7.2.2.1 GERMAN

German is an official minority language in the former South Jutland County (part of what is now the Region of Southern Denmark), which was part of Imperial Germany prior to the Treaty of Versailles. Between 15,000 and 20,000 ethnic Germans live in South Jutland, of whom roughly 8,000 use either Standard German or the Schleswigsch variety of Low Saxon in daily communications. Schleswigsch is highly divergent from Standard German and can be quite difficult for Standard German speakers to understand. Outside South Jutland, the members of St. Peter's Church in Copenhagen use German in their church, on its website and in the school that it runs. The German minority operates its own system of primary schools with German as the primary language of instruction, as well as a system of libraries throughout South Jutland. It also operates a German high school located in Aabenraa (German: Apenrade). In addition, there were 28,584 immigrants from Germany living in Denmark in 2012.

7.2.2.2 FAROESE

Faroese, a North Germanic language like Danish, is the primary language of the Faroe Islands, a self-governing territory of the Kingdom. It is also spoken by some Faroese immigrants to mainland Denmark. Faroese is similar to Icelandic, and also to the language spoken in the Scandinavian area more than a millennium ago.

7.2.2.3 GREENLANDIC

Greenlandic is the main language of the 54,000 Inuit living in Greenland, which is, like the Faroe Islands, a self-governing territory of Denmark. Roughly 7,000 people speak Greenlandic on the Danish mainland.

7.3 RELEVANT ORGANISATIONS

• The Danish Language Council 26
• Society for Danish Language and Literature 27

26 http://nydsn.magenta.dk/dsn.dk


• Modersmål-Selskabet (‘the Mother Tongue Association’) 28
• Svenska Akademien (The Swedish Academy) 29

7.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

7.4.1.1 GERMAN

There are no main legal provisions in force concerning the use of regional or minority languages in Denmark. Around a third of the 20,000 Germano-Danish citizens in South Jutland speak German.

7.4.1.2 FAROESE

In the Faroe Islands, the law of autonomy guarantees official equality of Danish alongside the Faroese language, and Danish is an obligatory subject in schools. In Iceland, Danish has been a part of the school curriculum since the end of the 1990s, and Danish is still used to facilitate communications with other Nordic countries.

7.4.1.3 GREENLANDIC

In Greenland, the law of autonomy guarantees official equality of Danish alongside the Greenlandic language, and Danish is an obligatory subject in schools. In Iceland, Danish has been a part of the school curriculum since the end of the 1990s, and Danish is still used to facilitate communications with other Nordic countries.

7.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

A Danish Language Council (Dansk Sprognævn) was created in 1955. This Council is a research centre attached to the University of Copenhagen and falls under the authority of the Ministry of Culture. Its purpose is threefold: to modernise the language by creating neologisms, to set new rules (it publishes the Official Danish Dictionary) and to respond to questions from users.

The Danish Language Council must:
• monitor the development of the Danish language and give advice and information on it (it determines the spelling of Danish);
• edit publications on the Danish language, in particular those on the use of the native language, and co-operate with institutions of terminology, dictionary editors and public institutions involved in authorising or registering people’s names and surnames and brand names;
• collaborate with equivalent language councils and institutions in other Nordic countries;
• function as a secretariat for the Danish Sign Language Board.

27 http://dsl.academia.edu/
28 http://www.iaimte.com/
29 http://www.svenskaakademien.se/en


Because the laws on Danish usage are really along the lines of recommendations, the framework is not especially restrictive and is a long way from being systematically applied.

The Danish Language Council adopted a four-point plan for a Danish linguistic policy in 2003. The following points were underlined as central objectives of the Danish linguistic policy:
• Danish as a scientific and higher education language;
• Correct Danish as a working language in public services;
• Education strengthened by Danish at all levels;
• Education strengthened by foreign languages.


8. ESTONIA

8.1 BACKGROUND

8.1.1 COUNTRY CHARACTERISTICS

Estonia is the most northerly of the three Baltic states, and has linguistic ties with Finland. Since regaining its independence with the collapse of the Soviet Union in 1991, Estonia has become one of the most economically successful of the European Union's newer eastern European members.

It experienced its first period of independence in 1918, following the end of the First World War and the collapse of the Russian Empire. But the new state, which underwent periods of both democratic and authoritarian rule, was short-lived. After only 20 years, Estonia was forcibly incorporated into the Soviet Union in 1940, following a pact between Hitler and Stalin. German troops occupied Estonia during World War II, before being driven out by the Soviet army. Few nations formally recognized the Soviet annexation, and most consider it an illegal occupation. One of its legacies is a large Russian minority, about a quarter of the population according to the 2011 census. In Soviet times, the influx of non-Estonians led some to fear for the survival of Estonian culture and language.

The Russians' status has been a cause of controversy. Some, including the Russian government, criticize the requirements for obtaining Estonian citizenship, especially the need to show proficiency in the Estonian language, which left most ethnic Russians stateless after independence. Estonia says the criteria for citizenship are similar to those of most nations around the world, and have in any case been gradually eased. It says the number of stateless persons dropped by 80% between 1992 and 2013.

Since independence, Estonia has politically and economically anchored itself firmly to the West, joining the EU and NATO in 2004. It sent a contingent of troops as part of NATO operations in Afghanistan. Russia's intervention in the Ukrainian crisis in 2014 has triggered some nervousness in Estonia over President Vladimir Putin's intentions towards other former Soviet states. The Estonian government has been fiercely critical of Russia's behaviour and has affirmed its pro-NATO stance in response to the events in Ukraine. Estonian governments have tended to pursue strongly free-market economic policies, privatizing state enterprises, introducing a flat-rate income tax, liberalizing regulation, encouraging free trade and keeping public debt low.

There has also been a strong emphasis on making Estonia a world leader in technology, leading some to speak of an "e-economy". This has included creating one of the world's fastest broadband networks, offering widespread free wireless internet, encouraging technology start-ups and putting government services online. In 2007, Estonia was the first country to allow online voting in a general election. The country experienced an investment boom in the early 2000s, especially after EU membership, with high annual growth rates hovering between 7-10%.


In 2008, Estonia's economy was hit by the global financial crisis. The government adopted tough austerity measures and won plaudits for getting the economy back into shape. The country joined the European single currency in January 2011.

8.2 LANGUAGES

8.2.1 OFFICIAL

The only official language in Estonia is Estonian. The Estonian language is closely related to Finnish and, more distantly, Hungarian, but not to the Indo-European languages of the two other Baltic States, Latvia and Lithuania, or to Russian for that matter. Around one million people speak Estonian as their mother tongue. Varieties of Estonian include regional varieties such as Setu and Voru, spoken in the south-eastern corner of the country. The state supports the use of regional varieties and their preservation as a cultural treasure, as a source for the development of Standard Estonian and as a bearer of local identity. Schools in South Estonia quite often teach local dialects as an optional subject.

8.2.2 CO-OFFICIAL LANGUAGES

None, but see above (regional varieties).

8.3 RELEVANT ORGANISATIONS

In order to preserve the Estonian language, several state institutions have been established. The Language Inspectorate checks the enforcement of the legislative acts concerned with language matters. The Language Policy Department at the Ministry of Education and Research is involved in policy planning and helps to make the language better known abroad. The Estonian Language Council is the ministry’s advisory council on language and has compiled the strategy for maintaining and developing the Estonian language.

The Centre of Estonian Language Resources (CELR) is listed in the Estonian Research Infrastructure Roadmap as a nationally important infrastructure (see also the Development Plan 2011-2017). The centre acts as an infrastructure for a consortium of three institutions and is the CLARIN ERIC centre in Estonia.

8.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

8.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

In 2006, the “National Programme for Estonian Language Technology” (NPELT 2006-2010) was launched. The main objective was to advance language technology support for the Estonian language to a level that would enable Estonian to function successfully in today’s information technology environment. NPELT financed research and development in language technology, from the building of resources to the building of prototypes for language technology applications.


The measures for preservation and development of the Estonian language are defined in the Development Strategy of the Estonian Language (2004-2010) and the Estonian Language Development Plan (2011-2017). Practical language usage in Estonia is regulated by the Language Act.

8.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

The most recent policy document is the Estonian Language Development Plan (2011-2017). It lays down the main strategic directions for the development, teaching, research and protection of Estonian. The development strategy includes language technology: under this Plan, a National Programme supporting Estonian language technology continues. This programme will focus more on applications and on making the developed resources and tools publicly available.

The Development Plan includes a chapter titled “Language Technology Support of the Estonian Language”, which sets the following objective: “The level of language technology support of the Estonian language is on par with the languages of language-technologically advanced countries (e.g. the Nordic countries) in areas which are required by the developments and applications of software aimed at the Estonian language”.

The Development Plan includes a chapter on CLARIN, as an ESFRI Roadmap infrastructure, and stipulates that Estonian researchers should benefit from the wealth of pan-European language resources and technologies. In order to reach this, and to add Estonian language resources to this infrastructure, the Centre of Estonian Language Resources was created.

The Ministry of Education and Research is also funding more research-oriented projects on language technology, using targeted financing schemes and grants of the Estonian Science Foundation. One such national programme was NPELT (see above), succeeded by ECLM (Estonian language and cultural memory).

NEEDS

Although Estonia has a reasonable level of the most basic language technology tools and resources, they are on the whole rather simple and have limited functionality in some areas. There is a significant language technology research scene in the country, but unfortunately hardly any involvement from industry, apart from a few SMEs.

The need for large amounts of data and the complexity of language technology systems make it very important to develop a new infrastructure and a more coherent research organization.

There is also a lack of continuity in research and development funding, and a need for more coordination with programmes in other EU countries and at the EC level.


CURRENT ACTIONS

Most current actions have been or are being carried out within the National Programme on Estonian Language Technology 2011-2017, which has five main sub-objectives:
• Research and development projects for building software prototypes
• Projects for building language resources
• Creating a central depository for managing resources and software (CELR)
• Integrated language software and its applications
• Development projects (within all sub-objectives, with open competition)


9. FRANCE

9.1 BACKGROUND

Like Italy, Germany and the BENELUX countries, France is a founding member of the EU (then the EEC).

9.1.1 COUNTRY CHARACTERISTICS

France, officially the French Republic (French: République française), is a sovereign state comprising territory in Western Europe and several overseas regions and territories (French Guiana, French Polynesia, Guadeloupe, Martinique, Mayotte, New Caledonia, Réunion, Saint Barthélémy, Saint Martin, Saint Pierre et Miquelon, Wallis and Futuna). The European part of France, called Metropolitan France, extends from the Mediterranean Sea to the English Channel and the North Sea, and from the Rhine to the Atlantic Ocean. France covers 640,679 square kilometres and, as of August 2015, has a population of 67 million, counting all the overseas departments and territories (which total 2.7 million inhabitants)30.

9.2 LANGUAGES

The languages of France include the French language and regional languages. The French constitution, in its Title 1, Article 2, states that “the language of the Republic shall be French”. This article has so far prevented France from ratifying the European Charter for Regional or Minority Languages, of which it is a signatory.

As a result of French centralisation, regional languages, although actively spoken, were repressed31.

François Hollande wants to ensure a clear legal framework for regional languages within a programme of administrative decentralisation that would give the regions competencies in language policy. However, at the end of July 2015, the Constitutional Council (again) gave an unfavourable opinion on a constitutional change, on the grounds that the charter endangers constitutional principles such as the indivisibility of the Republic and the uniqueness of the French people32. This is despite the fact that Article 75-1 of the same constitution acknowledges: “Regional languages are part of the French heritage”.

Nevertheless, François Hollande plans to submit the constitutional change again to the Parliament and the Congress in 2016.

9.2.1 CO-OFFICIAL LANGUAGES

Currently, no languages are recognised as co-official languages. This does not mean that no other languages are spoken: a 1999 report identified 75 languages that would qualify as regional or minority languages under the European charter. The following map gives an overview of the main ones in Metropolitan France:

30 Source: , consulted August 2015
31 Jack Lang, statement 2001, then Minister of Education
32 “La charte mettrait en cause les principes d’indivisibilité de la République et d’unicité du peuple français” http://www.lemonde.fr/societe/article/2015/08/01/les-langues-regionales-bientot-reconnues-par-la-constitution_4707451_3224.html#Sq0Se0BdLKY9FpXZ.99


FIGURE 3 LANGUAGES OF FRANCE (SOURCE: WIKIPEDIA)

9.3 RELEVANT ORGANISATIONS

• The “Délégation générale à la langue française et aux langues de France (DGLFLF)” of the Ministry of Culture and Communication is in charge of language issues and coordinates language policies. In 2001, “et aux langues de France” was added to its title to mark the government’s acknowledgement of the linguistic diversity of France. As stated on its website33, its mission is, amongst others: “to work closely with the economic, social, professionals and scientific sectors, as well as with a large number of associations to fight together for a better understanding of the language issue in public policies”.
• The Agence universitaire de la Francophonie (AUF) is an international association comprising universities, grandes écoles, academic networks and scientific research centres that use the French language all over the world. With a network of 804 members in 102 countries, it is one of the world’s largest higher education and research associations.
• Observatoire européen du plurilinguisme (http://www.observatoireplurilinguisme.eu). The European Observatory for Plurilingualism is a French initiative that, among other activities, issues a Charter for Plurilingualism. Another of its objectives is to promote the use of French.
• Association pour la sauvegarde et l'expansion de la langue française (http://www.asselaf.fr/) is an association founded in 1990 to promote the teaching and usage of French in French-speaking countries and abroad.
• Association des fonctionnaires français des organisations internationales (http://www.affoimonde.org/), the Association of French Civil Servants in International Organisations, takes the lead in linguistic strategies that centre on the French language.
• Organisation internationale de la francophonie (OIF - http://www.francophonie.org) is an organisation based on plurilingualism, with French as the “shared language” among French-speaking countries.

33 http://www.culturecommunication.gouv.fr/Politiques-ministerielles/Langue-francaise-et-langues-de-France/

9.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

PAST LANGUAGE STRATEGIES AND POLICIES

In 1994, France introduced a law (Loi Toubon) to preserve the French language against the “invasion” of Anglicisms. Public texts, TV and advertising are not allowed to use words of English origin. The DGLFLF (see above) replaces English words with French ones, e.g. “courriel” (courrier électronique) instead of “e-mail”.

Following a report to the Prime Minister in 2000 on the major role of HLT in the Information Society, the Technolangue programme was launched in April 2002 as a large French national programme on language technologies; it lasted until 2006. The website listing the financed projects is still available: http://www.technolangue.net/rubrique.php3?id_rubrique=24

Specific results, e.g. from the CESART project, which made it possible to run an evaluation campaign for terminology extraction tools (monolingual French), are available on the ELRA website: http://catalog.elra.info/product_info.php?products_id=993

In 2006, the French government commissioned a study on “Language Technologies in Europe”34. In its conclusion, the study states: “Several factors incite decision-makers to integrate innovative solutions into their company to intelligently manage digital content:
- the digital convergence of computerisation and information and communication technologies;
- the integration of multimedia content;
- high-speed, wireless Internet;
- web 2.0 and web 3.0 applications that promote editorial and social contributions of users-subscribers of contents and services.
The progressive use of ICT leaves us to predict that the language tools market will open towards the general public. The need is felt to take marketing actions to optimize the appropriateness of the supply and demand.” [Emphasis added]. Alas, no marketing actions followed, nor did France continue the Technolangue initiative. However, there is an attempt to revive it as “Technolangue II”, see below.

9.4.1 CURRENT NATIONAL POLICIES AND STRATEGIES

A special web section of the Ministry for Culture and Communication is dedicated to “the French language and the languages of France” (http://www.culturecommunication.gouv.fr/Politiques-ministerielles/Langue-francaise-et-langues-de-France/). It is run by the DGLFLF (see above) to “coordinate and animate the language policy of the government”.

34 http://www.technolangue.net/IMG/pdf/SyntheseUKEtudeMarche-Technolangue2006.pdf

The recent Guide des bonnes pratiques linguistiques dans les entreprises35 issued by the DGLFLF is aimed at companies that are based in France and work at international level; it seeks to reconcile the use of French with the needs of global communication.

In 2014, the DGLFLF also issued a summary paper on “Digital technologies at the service of languages”36, in which it sets out existing, short-term and longer-term initiatives:

Existing:
• Crowdsourcing to enrich the French language: http://wikilf.culture.fr/
• The JocondeLab project, in which ca. 300,000 artworks are described in 14 languages: http://jocondelab.iri-research.org/jocondelab

Short-term:
• SémanticPédia initiative (https://fr.wikipedia.org/wiki/S%C3%A9manticp%C3%A9dia), a collaboration of the Ministry for Culture and Communication, INRIA and Wikimédia France.
• A one-day workshop (a cooperation of CNRS and the Ministry of Research) as a stepping stone towards a large national programme to support the development of language tools for French and the languages of France

Medium- to long-term:
• Further enlargement of the SémanticPédia experience
• Creation of a “Technolangue II” that could be financed by the Programme d’Investissements d’Avenir (PIA).

Furthermore, closer cooperation with the EU is envisaged, in particular in the areas of culture (EUROPEANA), learning (DG EAC) and language technologies.

35 http://www.culturecommunication.gouv.fr/Politiques-ministerielles/Langue-francaise-et-langues-de-France/Politiques-de-la-langue/Guide-des-bonnes-pratiques-linguistiques-dans-les-entreprises
36 http://www.culturecommunication.gouv.fr/content/download/103702/1221063/version/1/file/Le%20numerique%20au%20service%20de%20la%20politique%20linguistique.pdf


10. GERMANY

10.1 BACKGROUND

10.1.1 COUNTRY CHARACTERISTICS

The Federal Republic of Germany (“Bundesrepublik Deutschland”) comprises 16 constituent states and covers an area of 357,021 square kilometres in Western/Central Europe. A founding member of the EU (then the EEC), it is, with 81 million inhabitants, the most populous member state of the European Union. After the United States, it is the second most popular migration destination in the world. Federalism is one of the constitutional principles of Germany. According to the German constitution (called Grundgesetz or, in English, Basic Law), some topics, such as foreign affairs and defence, are the exclusive responsibility of the federation (i.e. the federal level), while others fall under the shared authority of the states and the federation; the states retain residual legislative authority for all other areas, including "culture", which in Germany also includes most forms of education and job training37. Languages fall under the competence of the “Länder”, which makes a coherent approach towards them difficult. The only time the federal government intervened in a matter of language was in 1998, with the reform of German orthography (“Rechtschreibreform”).

10.2 LANGUAGES

10.2.1 OFFICIAL

The official language (“Amtssprache”) of Germany is German. However, this is not laid down in the constitution but only in federal and regional laws. Initiatives to amend the constitution to include German as the official language have so far failed. German dialects are widely spoken, but Standard German (“Hochdeutsch”) is the language taught at schools.

FIGURE 4 MAP OF GERMAN DIALECTS (SOURCE: WIKIPEDIA)

37 Source: Wikipedia


10.2.2 CO-OFFICIAL LANGUAGES

Germany has no co-official languages but recognises some regional first languages: Sorbian 0.10%, Romani 0.08%, Danish 0.06% and North Frisian 0.01%. The most important immigrant language is Turkish, spoken by 1.8% of the population as a result of the large waves of migrant guest workers who came to Germany from the 1950s to the 1970s and stayed. The recognition of English as an official language is frequently discussed in public in Germany but, so far, to no avail.

10.3 RELEVANT ORGANISATIONS

10.3.1 LANGUAGE TECHNOLOGY

The German Society for Computational Linguistics and Language Technology (GSCL - http://www.gscl.org/index-en.html) is the scientific association for research, teaching and professional work in natural language processing. It supports cooperation with neighbouring disciplines (e.g. linguistics and semiotics, computer science and mathematics, psychology and cognitive science, information science) and maintains contact with the respective associations.

German Competence Center in Speech and Language Technology at the German Research Center for Artificial Intelligence (DFKI - www.dfki.de)

Universities with specialised studies in language technologies and information technologies (Informationswissenschaft und Sprachtechnologie; non-exhaustive list): Centre for Linguistics and Language Technologies at the University of Saarbrücken; Justus-Liebig-Universität Gießen; Heinrich-Heine-Universität Düsseldorf; University of Duisburg-Essen; the universities of Hildesheim, Bielefeld, Darmstadt, Stuttgart, Trier and Siegen; Technische Hochschule Cologne.

10.3.2 LANGUAGE IN GENERAL

• Goethe-Institut (https://www.goethe.de/en/): Worldwide institute for the promotion and teaching of the German language
• Gesellschaft für deutsche Sprache (GfdS - http://www.gfds.de): The Society for the German Language is a not-for-profit association for the protection and research of the German language
• Institut für deutsche Sprache (IDS - http://www.ids-mannheim.de): The Institute for the German Language, based in Mannheim, researches the German language, its use and its recent history.
• Verein Deutsche Sprache (VDS - http://www.vds-ev.de/): Not-for-profit association that protects and promotes the German language. It is the main driver of efforts to include the German language in the constitution.

10.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES


10.4.1 CURRENT NATIONAL POLICIES AND STRATEGIES

Languages are in the competence of the Länder (constituent states) and, apart from the above-mentioned spelling reform (“Rechtschreibreform”), no initiative regarding languages has ever been carried out at federal level. At regional level, language policies refer mainly to school education. Of the 38,000 schools in Germany, only 200 are bilingual, and these are mainly German-English or German-French. Neighbouring languages and immigrant languages are hardly taught.

With the current immigration wave, calls to make German an obligatory language are becoming louder again. Some Länder provide for obligatory German learning, others do not. There is no coherent approach.

Despite the lack of public language strategies (or because of it), the LT scene is growing, both in terms of SMEs (e.g. in and around Berlin) and in terms of university courses in computational linguistics and language and information technology.


11. LUXEMBOURG

11.1 BACKGROUND

11.1.1 COUNTRY CHARACTERISTICS

The Grand Duchy of Luxembourg is one of the smallest countries in Europe and one of the founding members of the European Union. It is located in Western Europe and is surrounded by Belgium, France and Germany. Luxembourg has a population of about 549,680 and covers an area of 2,586 km2. The country is divided into two regions, the Ardennes and the Bon Pays, and is best known for its landscape of mountains and rivers. After the Second World War, Luxembourg sought greater independence from Germany and opted for close economic cooperation with Belgium. Luxembourg is one of the strongest economies in Europe: it has the highest GDP per capita in Europe and holds second place in the world for income per capita. Together with the Netherlands and Belgium, Luxembourg was also a founder of the Customs Union in the European Union. The largest minority group (13% of the population) is from Portugal. This flow of migrants occurred between 1970 and 1997, when the standard of living was rising and Luxembourg’s demand for labour increased. According to recent data, the population has increased by approximately one percent compared with previous years38, owing to migration. Because of its location, Luxembourg is strongly influenced by German and French culture.

11.2 LANGUAGES

11.2.1 OFFICIAL

Luxembourg has a relatively unusual language situation in that the use of languages is laid down not in the constitution but in ordinary law. The Language Law of 1984 decrees that “the official language of Luxembourgers shall be Luxembourgian” (Art. 1). As such, Luxembourg is one of 2 EU Member States where an official language is not an official language of the EU (the other being Cyprus, with Turkish). Subsequent articles of the Language Law decree that legislative acts shall be drawn up in French and that in “contentious or non-contentious administrative or judicial matters, French, German or Luxembourgian may be used”. Between 2000 and 2002, the linguist Jerome Lulling developed a lexical database of 185,000 word forms for the very first Luxembourgish spellchecker, thus launching the computerization of the Luxembourgish language.

The origins of Luxembourg’s specific linguistic situation are closely related to the country’s history. In 963, Count Sigfrid acquired the remains of a Roman “Castellum” and called it Lucilinburhuc, which later became Luxembourg39. The region later became part of the Holy Roman Empire, and the language used was German. After World War II, Luxembourg developed, cooperated successfully with Belgium and the Netherlands (“BENELUX” union) and increased its influence in Europe.

38 http://www.everyculture.com/Ja-Ma/Luxembourg.html
39 https://www.coe.int/t/dg4/education/minlang/Report/PeriodicalReports/LuxembourgPR1_en.pdf


11.2.2 CO-OFFICIAL LANGUAGES

Luxembourg has no regional or minority languages, but it has ratified the European Charter for Regional or Minority Languages. As already mentioned, the Portuguese minority accounts for 13% of the population, followed by Italian and French citizens who moved there during 1970. Luxembourg is well known for its linguistic diversity: according to the Eurobarometer, 90 percent of the population can speak more than two languages40.

11.3 RELEVANT ORGANISATIONS

• Office luxembourgeois de l’accueil et de l’intégration (OLAI)41: This government-sponsored organisation facilitates the integration of foreigners legally resident in Luxembourg by offering language courses and citizenship courses in Luxembourgish, the national language.
• Actioun Lëtzebuergesch42: This non-profit organisation promotes the Luxembourgish language and speaks for everything related to it. It contributed to the 1984 law that established Luxembourgish as the first language.
• Inlingua43: Two language schools that specialise in language courses and offer the opportunity to learn one of the official languages, with particular emphasis on Luxembourgish.
• Centre de Langues Luxembourg44: A portal offering free online courses in Luxembourgish, developed by the Quattropole city network in cooperation with the University of Luxembourg and the Grand Duchy’s Ministry of National Education45. It is a public establishment that offers language courses in Luxembourgish and in the two other official languages.
• University of Luxembourg46: The university has established a research centre on multilingualism, which carries out research on Luxembourgish and on the other two official languages.
• Prolingua Language SA47: The first language centre in Luxembourg, established in 1983.
• National Languages Institute: A public establishment that offers language courses in Luxembourgish and also in the other official languages48.

It is interesting to note that the University of Sheffield49 (UK) has established a Centre for Luxembourgish Language Studies. Since 1999, the university has offered a master’s programme in Luxembourgish studies.

40 http://ec.europa.eu/public_opinion/archives/ebs/ebs_386_en.pdf
41 http://www.olai.public.lu/en/accueil-integration/mesures/contrat-accueil/
42 http://www.actioun-letzebuergesch.lu/home.html
43 http://www.inlingua.lu/
44 http://www.insl.lu/
45 http://www.luxembourg.public.lu/en/etudier/apprendre-luxembourgeois/cours-langue-lux/index.html
46 http://wwwen.uni.lu/recherche/flshase/education_culture_cognition_and_society_eccs/research_institutes/research_on_multilingualism_mling
47 http://www.prolingua.lu/
48 http://www.insl.lu/


11.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

11.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

In 1984, Luxembourgian (or Luxembourgish) was “officialised” as the national language (but without further explanation of what “national” entails50). French is de facto used by the administration and in legislation (the latter is based on the French Code Napoléon).

In 2005, Luxembourg approved a law implementing the European Charter for Regional or Minority Languages51. Compared with its neighbouring countries, Luxembourg follows a different language policy: it recognises minority languages and seeks to promote and foster diversity within the country. Successive waves of migrants during the 19th and 20th centuries have contributed to the spread of multilingualism in Luxembourg. This is also linked to Luxembourg’s geographic situation.

49 https://www.shef.ac.uk/luxembourg-studies
50 https://www.abdn.ac.uk/pfrlsu/documents/Redinger,%20Language%20Planning%20and%20Policy%20on%20Linguistic%20Boundaries.pdf
51 https://www.coe.int/t/dg4/education/minlang/Report/PeriodicalReports/LuxembourgPR1_en.pdf


12. HUNGARY

12.1 BACKGROUND

12.1.1 COUNTRY CHARACTERISTICS

Hungary is a country in Central Europe bordering both Slavic (Slovak Republic, Slovenia, Ukraine, Serbia, Croatia) and non-Slavic (Austria, Romania) countries. It is a landlocked country with an area of 93,030 km2. The population was 9.94 million according to the 2011 census. The capital and largest city is Budapest. The foundation of Hungary was laid in the 9th century with the conquest of the Carpathian Basin. The language of the medieval state was at first Latin, followed by German under Habsburg rule, with Hungarian becoming the language of public administration in the 19th century.

The current borders of Hungary were first established by the Treaty of Trianon after World War I. After World War II, Hungary joined the Warsaw Pact. In 1989, Hungary again became a democratic parliamentary republic. It joined the European Union in 2004.

12.2 LANGUAGES

According to the 2011 census, 9.83 million people spoke Hungarian as a first language (99% of the population)52. Hungarian is a Uralic language and one of the few European languages that do not belong to the Indo-European language family. Hungarian is also spoken in seven neighbouring countries and in emigrant communities.

According to the Fundamental Law of Hungary, nationalities living in Hungary “have the right to use their mother tongue, to use names in their own languages individually and collectively, to nurture their own cultures, and to receive education in their mother tongues.”53

In 1995, the government ratified the European Charter for Regional or Minority Languages in respect of Croatian, German, Greek, Romanian, Serbian, Slovak, and Slovenian.

English (16%) and German (11%) are the most widely spoken foreign languages54.

12.2.1 OFFICIAL

The official language of Hungary is Hungarian, as declared in the Fundamental Law of Hungary (adopted in 2011)55.

52 http://www.ksh.hu/nepszamlalas/tablak_teruleti_00 (accessed 10 August 2015).
53 http://www.kormany.hu/download/e/02/00000/The%20New%20Fundamental%20Law%20of%20Hungary.pdf (retrieved 20 August 21 August 2015).
54 http://www.ksh.hu/docs/hun/xftp/idoszaki/nepsz2011/nepsz_orsz_2011.pdf (retrieved 10 August 2015).
55 http://www.kormany.hu/download/e/02/00000/The%20New%20Fundamental%20Law%20of%20Hungary.pdf (retrieved 20 August 21 August 2015).


12.3 RELEVANT ORGANISATIONS

• Magyar Nyelvstratégiai Intézet (Hungarian Language Strategy Institute)56: Established in 2014, overseen by the Prime Minister’s Office.
• Magyar Tudományos Akadémia Nyelvtudományi Intézet (Research Institute for Linguistics of the Hungarian Academy of Sciences)57: The tasks of the Institute include theoretical and applied research in general linguistic issues, as well as in Hungarian linguistics, Uralic studies, and . It also conducts the ongoing compilation of the comprehensive dictionary of the Hungarian language. Further tasks include the assembly of linguistic corpora and databases. The Institute investigates different variants of Hungarian and minority languages in Hungary, as well as issues in language policy. The Department of Language Technology prepared the first version of the Hungarian National Corpus.
• Balassi Intézet (Balassi Institute)58: The main role of the Institute is to develop and attain Hungary’s objectives in the area of cultural diplomacy. It was launched to promote Hungarian language and culture, analogously to the British Council and the Goethe-Institut.
• Magyar Nyelvtudományi Társaság (Society of Hungarian Linguistics)59
• Nemzetközi Magyarságtudományi Társaság (International Association for Hungarian Studies)60
• Philological Faculty at the Eötvös Loránd University61
• University of Pannonia, Veszprém62
• Philological Faculty at the University of Pécs63
• University of Szeged, Department of Informatics64: Several research projects together with the Department of Language Technology, Research Institute for Linguistics of the Hungarian Academy of Sciences, and MorphoLogic.
• Budapest University of Technology and Economics, Faculty of Electrical Engineering and Informatics65
• MorphoLogic66: A private R&D company. Various projects on machine translation and related topics.
• Kilgray67: Established in 2004, Kilgray specialises in computer-assisted translation solutions, but also has extensive research experience in the ergonomics and quality control of translation work, language-independent segmentation, similarity search and indexing algorithms.
• Nyelv- és Beszédtechnológiai Platform (Platform for Language and Speech Technology)68: A cooperative group of 8 industrial and research partners for the representation of interests. Its goals are to facilitate the development of Hungarian language and speech technology and the use of tools already developed, and to present strategic goals for language and speech technology as an independent industrial branch of the future.

56 http://manysi.hu
57 http://www.nytud.hu/eng/index.html
58 http://www.balassiintezet.hu/en/
59 http://mnyt.hu/
60 http://hungarologia.net/en
61 https://www.elte.hu/en
62 http://www.uni-pannon.hu/
63 http://pte.hu/english
64 https://www.inf.u-szeged.hu/en
65 http://www.tmit.bme.hu/?language=en
66 http://www.morphologic.hu/index.php?lang=en
67 https://www.memoq.com
68 http://www.hlt-platform.hu/

Several organisations are active in the field of terminology: Terminológiai (Szaknyelvi) Bizottság (terminology institute - http://www.tankonyvtar.hu); Magyar Nyelv Terminológiai Tanácsa (MaTT), a terminology institution whose website is in Hungarian only (http://www.matt.hu/); Magyar Szabványügyi Testület, the Hungarian Standards Institution (http://www.mszt.hu); TermDok - Pécsi Tudományegyetem Terminológiai Dokumentációs Központja, at the University Library of Pécs and Centre for Learning, website in HU, DE and EN (http://lib.pte.hu); and TERMIK - Károli Gáspár Református Egyetem, Terminológiai Kutatócsoport69 (Terminology Research Group, http://alknyelvport.nytud.hu/muhelyek/termik).

12.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

12.4.1 CURRENT NATIONAL POLICIES AND STRATEGIES

Hungary has not had an official language policy. However, in 2014 the new Hungarian Language Strategy Institute was set up under the Prime Minister's Office by Government Decree 55/2014 (III.4). The Decree sets out 14 tasks for the Institute, inter alia: to establish and monitor a medium-term strategy for the Hungarian language; to research the language’s structure, characteristics, and functioning; to coordinate and conduct research into terminology; to participate in the formulation of principles for supporting Hungarian-language databases; to give expert opinions on questions of language policy for public administration and public media; to maintain the richness of the language; and to implement findings in public education.70 The Decree envisages the Institute’s activities as both national and international in scope.

In general, language policy initiatives and programmes have been largely financed through the Ministry for Culture and Education (e.g. kindergartens, schools, school books, scholarships, etc.). Projects and initiatives related to language technology and linguistic resources for Hungarian have been supported and funded nationally (for example by research organisations, through bilateral agreements between organisations, or by the Hungarian Scientific Research Fund) and internationally (for example through the European Union FP7 programme, the LLP programme, etc.).

Language technology initiatives and projects focusing on linguistic resources have been carried out by the Department of Language Technology at the Research Institute for Linguistics of the Hungarian Academy of Sciences. It has participated in several international projects that aimed to adopt certain processes developed for western European languages, now considered part of the standard for the analysis of Hungarian (Multext-East, Gramlex), and to develop new standards for creating linguistic resources. The researchers at the department have played an active role in adapting computerized language processing systems and technologies to the needs of Hungarian. The Hungarian National Corpus, a reference corpus of present-day Hungarian that reflects written use and now consists of 187 million words, including language variants from , Subcarpathia, Transylvania and Vojvodina, has recently been completed as the result of joint work by the Hungarian Language Offices and the Department of Corpus Linguistics. Further related projects by the department include, inter alia: Construction of the Hungarian WordNet Ontology and its Application in Information Extraction Systems (together with University of Szeged, Department of Informatics and MorphoLogic), and Hungarian-English Machine Translation System (together with University of Szeged, Department of Informatics and MorphoLogic). The department has participated in several EU-funded projects, such as Central and South-East European Resources (CESAR)71 and Internet Translators for all European Languages (iTranslate4)72. Current projects73 include a Slovak-Hungarian parallel corpus and an online spelling consultation portal74.

69 Judit Muráth, University of Pécs, Faculty of Business and Economics
70 Terminology Documentation Centre, August 2015.

71 http://www.cesar-project.net/
72 http://itranslate4.eu/project/
73 For a comprehensive list of projects see http://www.nytud.hu/depts/corpus/projektek.html.
74 http://helyesiras.mta.hu/

LT_OBSERVATORY – OBSERVATORY FOR LR AND MT IN EUROPE LT_OBSERVATORY

13. IRELAND

13.1 BACKGROUND

13.1.1 COUNTRY CHARACTERISTICS

Ireland is an island in the North Atlantic on the western periphery of Europe. It is the twentieth-largest island on Earth.

Politically, Ireland is divided between the Republic of Ireland (officially named Ireland), which covers five-sixths of the island, and Northern Ireland (part of the United Kingdom), which covers the remaining area and is located in the north-east of the island. The island is divided into four provinces (Ulster, Munster, Leinster and Connacht) and 32 counties, 26 of which lie south of the border, with the remaining 6 in Northern Ireland. Dublin is the capital city, with a population of approximately 1.1 million, followed in terms of population by Cork (198,582), Limerick (91,454), Galway (76,778) and Waterford (51,519). According to the 2011 Census of Population, the population of the Republic of Ireland was just over 4.588 million.

13.1.2 LANGUAGES OVERVIEW SITUATION

A number of languages are used in Ireland: English, Irish, Ulster Scots and immigrant languages such as Greek, Polish, Lithuanian, Latvian, Spanish, Cantonese, Japanese, Mandarin, Hindi, Urdu, Punjabi and Arabic. Irish Gaelic is the only language to have originated within the island, while the others have been introduced through foreign settlement. Since the late nineteenth century, English has been the predominant first language. A large minority claims some ability to use Irish, but it is the first language of only a small percentage of the population. Irish became an official and working language of the European Union on 1 January 2007.

13.2 LANGUAGES

13.2.1 OFFICIAL

Under Article 8 of the 1937 Constitution of Ireland, Irish is the state’s first official language, with English also having official status.

13.3 RELEVANT ORGANISATIONS

The main state organisations in Ireland promoting the Irish language are:
• The Department of Arts, Heritage and the Gaeltacht (http://www.ahg.gov.ie) - Government department with responsibility for language matters;
• Foras na Gaeilge (http://www.gaeilge.ie/) - national language body; and
• Údarás na Gaeltachta (www.udaras.ie) – the state body responsible for the linguistic, economic and cultural development of the Gaeltacht.


13.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

13.4.1 PAST & CURRENT NATIONAL POLICIES AND STRATEGIES

The objective of Government policy in relation to the Irish language, as detailed in the 20-Year Strategy for the Irish Language 2010-2030, is to increase the use and knowledge of Irish as a community language. Specifically, the Government’s aim is to ensure that as many citizens as possible are bilingual in Irish and English. It is an integral component of the Government’s Irish language policy that close attention be given to the place of Irish in the Gaeltacht, particularly in light of research indicating that the language’s viability as a household and community language in the Gaeltacht is under threat.

The aim of Government policy is also to:
• increase the number of families throughout the country who use Irish as the daily language of communication;
• provide linguistic support for the Gaeltacht as an Irish-speaking community and recognise the issues which arise in areas where Irish is the household and community language;
• ensure that in public discourse and in public services the use of Irish or English will be, as far as practical, a choice for the citizen to make, and that over time more and more people throughout the State will choose to do their business in Irish; and
• ensure that Irish becomes more visible in our society, both as a language spoken by citizens and in areas such as signage and literature.

The Strategy sets out areas of action under nine key headings:
1. Education
2. The Gaeltacht
3. Family Transmission of the Language – Early Intervention
4. Administration, Services and Community
5. Media and Technology
6. Dictionaries
7. Legislation and Status
8. Economic Life
9. Cross-cutting Initiatives

Language technologies are addressed under heading 5, “Media and Technology”. The Strategy states that an IT strategy will be developed, to include IT terminology and lexicographical resources; localisation and open source applications; switchability of interface and language attributes; additional content creation aids to supplement spellcheckers and computerised dictionaries; markers; multilingual web pages; terminology for computer-aided translation; multilingual content/document management systems; language technology issues and corpora; speech technology, speech synthesis, speech recognition, adaptive technology and embedding issues; capacity building for end users and technology specialists; e-learning and the Irish language; call centre software; back end databases and bi/multilingualism; metadata; mobile devices; optical character recognition; and handwriting recognition.

Such IT developments need also to be embedded in educational, social and work-related practices to become effective means of enhanced communication.

PUBLIC SERVICES RELATED TO AUTOMATIC TRANSLATION

In 2014, the Department of Arts, Heritage and the Gaeltacht (DAHG) entered into an agreement with the Centre for Global Intelligent Content at Dublin City University to develop a statistical machine translation (MT) system for use by the Department’s staff translators. Following the success of this pilot project, a new agreement has been entered into with the same organisation for the system to become fully operational within the Department and to be further refined during 2015. In tandem with this, the Department is funding research by Trinity College Dublin into the development of a rule-based MT system, with the medium-term aim of amalgamating the two systems in order to achieve maximum efficiencies. The development of these systems forms an integral part of an overall plan to establish a shared Irish-language translation service for the Irish Civil Service.

PROJECTS AND INITIATIVES

Several LT and MT projects and initiatives carried out in Ireland have received support from the public administration:
• abair.ie originated from the university research project Cabóigín I. The project's goal was to develop a full-fledged Text-to-Speech synthesis system for Irish, and it was funded by Foras na Gaeilge. The first synthetic voice developed was called Cabóigín, an Ulster Irish voice (Gweedore). The person in charge of the project is Prof Ailbhe Ní Chasaide from the Phonetics and Speech Laboratory, School of Linguistic, Speech and Communication Sciences (CLCS), Trinity College, Dublin. Cabóigín I built on foundational research done in the WISPR (Welsh and Irish Speech Processing Resources) project, during which a speech corpus of Ulster Irish was built up that would allow the development of an Irish synthesiser. WISPR was funded by the European Union under the INTERREG IIIA programme, and the research was conducted in cooperation with Bangor University in Wales and with researchers from Dublin City University, University College Dublin and the Linguistics Institute of Ireland. Cabógaí II extended the work of Cabóigín I, also with funding from Foras na Gaeilge. During that project the team improved the system developed under Cabóigín I and developed a Connaught voice (Ráth Chairn); it also started the development of a voice for Munster Irish (Dingle Peninsula). The current project, ABAIR, is funded by the Department of Arts, Heritage and the Gaeltacht, which enables the ABAIR team to continue developing state-of-the-art synthesis systems featuring a variety of voices and dialects. Munster Irish is the next major dialect that will be available on the website (http://www.abair.tcd.ie/?page=background&lang=eng).

• Foclóir na Nua-Ghaeilge (http://www.focloir.ie/): English-Irish Dictionary, launched in January 2013. The dictionary is available free of charge and has been adapted to work on both desktop computers and mobile devices. As well as translations for the English content, the dictionary also contains grammatical information and sound files (http://www.ria.ie/research/focloir-na-nua-ghaeilge.aspx).
• Tearma: the National Terminology Database for Irish, developed by Fiontar, DCU in collaboration with An Coiste Téarmaíochta, Foras na Gaeilge (http://www.tearma.ie/Home.aspx).

13.4.2 FUTURE NATIONAL POLICIES AND STRATEGIES

While DAHG and Foras na Gaeilge are currently funding a number of different Irish language technology projects in areas such as lexicography, parallel corpora, terminology, voice synthesis and machine translation, it is now recognised that a long-term plan is required in order to properly develop the sector.

This has led to the current situation, where the Department of Arts, Heritage and the Gaeltacht is, in consultation with the third-level sector, in the process of drafting a 10 Year Digital Plan for the Irish Language, publication of which is scheduled for late autumn 2015. The specialist areas proposed to be examined include:
• Digital Documentation & Linguistic Analysis of the Written and Spoken Dialects;
• Prosody and Timing/Rhythm Modelling for Speech Technology;
• Syntactic, Lexical and Semantic Resources;
• Natural Language Understanding;
• Speech Synthesis and Text-to-Speech;
• Speech Recognition;
• Dialogue Systems;
• Machine Translation;
• Information Retrieval;
• Speech and Language Applications for State and Public use;
• Educational applications and CALL;
• Technological applications for access and disability; and
• Synergy with indigenous and international companies.


14. ITALY

14.1 BACKGROUND

Like France, Germany and the Benelux countries, Italy is a founding member of the EU (then the EEC).

14.1.1 COUNTRY CHARACTERISTICS

Italy is a boot-shaped country situated in southern Europe, surrounded on the west by the Tyrrhenian Sea and on the east by the Adriatic. It is bounded by France to the west, Austria to the north and Slovenia to the east. The Apennine Mountains form the peninsula's backbone; the Alps form its northern boundary. Several islands form part of Italy; the largest are Sicily and Sardinia.

The Republic of Italy is formed of 20 autonomous regions, 5 of which (Sardinia, Sicily, Trentino-Alto Adige/Südtirol, Aosta Valley and Friuli-Venezia Giulia) enjoy a “special status” under Article 116 of the Italian Constitution of 1948, which gives them significantly broader legislative, administrative and financial autonomy. Each region, except for the Aosta Valley, is divided into provinces. By geographical area, the regions are:
• Centre: Lazio, Marche, Tuscany, Umbria
• North-East: Emilia Romagna, Friuli Venezia Giulia, Trentino-Alto Adige, Veneto
• North-West: Aosta Valley, Liguria, Lombardy, Piedmont
• Islands: Sardinia, Sicily
• South: Abruzzo, Apulia, Basilicata, Calabria, Campania, Molise

Rome is the capital city and also the country's largest and most populated commune. The Metropolitan City of Rome has a population of 4.3 million residents. Italy has 60,808,000 inhabitants (ISTAT, 1st January 2015).

14.1.2 LANGUAGES OVERVIEW SITUATION

Owing to its long history of strongly independent regional identities, which lasted until its relatively recent unification in 1861, Italy has kept a wide variety of regional languages75, spoken to varying degrees, some of which have gained official recognition.

The official and most widely spoken language is Italian, a Romance language descended from Tuscan. It is mainly spoken in Europe: in Italy, Switzerland, San Marino and Vatican City; as a second language in Albania, Malta, Slovenia and Croatia; by minorities in Crimea, Eritrea, France, Libya, Monaco, Montenegro, Romania and Somalia; and by expatriate communities in Europe, the Americas and Australia. Many speakers are native bilinguals of both standardised Italian and a regional language.

75 http://www.yourguidetoitaly.com/regional-languages-dialects.html

According to the Bologna statistics of the EU, Italian is spoken as a native language by 65 million people in the EU (13% of the EU population), mainly in Italy, and as a second language by 14 million (3%). The total number of speakers is around 85 million, and Italian is the fourth most frequently taught foreign language in the world.

The largest group of non-Italian speakers (around 1.6 million) speak Sardinian (Sardo, Sardu), a Romance language. Four dialects can be distinguished: Gallurese Sardinian, Logudorese Sardinian, Campidanese Sardinian and Sassarese Sardinian.

Another large community of some 600,000 people in the Friuli region speak Friulian, a Rhaeto-Romance language spoken in the Udine Province and extending to the Gorizia and Venezia provinces. It is sometimes called Eastern Ladin, since it shares the same roots as Ladin, although over the centuries it has diverged under the influence of surrounding languages. Friulian has no official status nationally or regionally.

Further regional languages include:
• Cimbrian (Tzimbro, Zimbrisch), a language of West Germanic origin spoken in the towns of Giazza (Glietzen, Ljetzen), Roana (Ramab) and Lusern, in the Sette and Tredici Comuni south of Trent Province.
• Italkian (Judeo-Italian), mainly spoken in urban areas in Rome and in central and northern Italy.
• Piedmontese, a language with considerable French influence, distinct enough from standard Italian to be considered a separate language. It is spoken in Piedmont (north-west Italy), except for the Provençal-speaking and Franco-Provençal-speaking Alpine valleys, and also in Australia and the USA.
• Ladin (ladino), a Rhaeto-Romance language spoken by 35,000 Italians living in the Dolomites, in the Trentino-South Tyrol region and in the Veneto region.
• Ligurian, a language closer to Piedmontese, Lombard and French than to standard Italian.
• Lombard, very different from standard Italian: a group of dialects (Milanese, Bergamasco, etc.), some of which are separate languages. Western Lombard dialects (of Ticino and Graubünden) are inherently intelligible to each other’s speakers. Speakers in more conservative valleys use some kind of “standard” dialect to communicate with speakers of other dialects of Lombard.

14.2 LANGUAGES

14.2.1 OFFICIAL76

Italian is the official language in Italy (statutory national language under Law No. 482 of 1999, Article 1.1). It is the most widely spoken language in the country, where almost all media (television, newspapers, movies, etc.) are produced in Italian.

76 http://www.queensu.ca/mcp/minoritynations/evidence/Italy.html

14.2.2 CO-OFFICIAL LANGUAGES

All languages are protected by national law: Law 482/1999, ‘Norme in materia di tutela delle minoranze linguistiche storiche’, adopted on 15 December 1999 and published in the Gazzetta Ufficiale della Repubblica Italiana n. 297 on 20 December 1999. The law also makes a distinction between those who are considered minority groups (Albanians, Catalans, Germanic peoples indigenous to Italy, Greeks, Slovenes and Croats) and those who are not (all the others).

These 12 officially recognised languages are: French (120,000 speakers), Occitan (50,000 speakers), Franco-Provençal (70,000 speakers), German (295,000 speakers), Ladin (28,000 speakers), Friulian (526,000 speakers), Slovene (85,000 speakers), Sardinian (175,000 speakers), Catalan (18,000 speakers), Arberesh (a variant of contemporary Albanian) (100,000 speakers), Greek (3,900 speakers) and Croatian (1,700 speakers).

FIGURE 5 LINGUISTIC MAP OF ITALY

Other languages are co-official within certain regions: French in Val d’Aosta77, German in Trentino-Alto Adige78, Slovene in parts of the provinces of Trieste and Gorizia, and Sardinian in Sardinia. The Ladin municipalities of South Tyrol are trilingual (Italian, Ladin and German).

77 Statute of Aosta Valley Article 38

14.3 RELEVANT ORGANISATIONS

The main state organisations promoting Italian are:
• Ministero degli Affari Esteri e della Cooperazione Internazionale - Ministry of Foreign Affairs and International Cooperation (http://www.esteri.it/mae/en/politica_estera/cultura/promozionelinguaitaliana/default.html/ ): in the context of the ministry’s cultural policy, the diffusion of the Italian language is an area of priority commitment:
  ◦ The Directorate General for Cultural Promotion and Cooperation made the strategic decision to intensify its commitment to the diffusion of the Italian language, making use of a network of institutions (Italian Cultural Institutes and Italian language and culture courses designed for communities of Italians and people of Italian descent abroad).
  ◦ The Directorate General for the Country Promotion (economy, culture and science) has among its action lines the promotion of Italian language and publishing abroad.
  ◦ Directorate General for Management and information and communications technology (ICT).
• Accademia della Crusca (http://www.accademiadellacrusca.it) is the most important centre of scientific research dedicated to the study and promotion of the Italian language. Its main goal is to spread historical knowledge of the Italian language, and of its present evolution in the framework of interlinguistic exchanges in the contemporary world, in Italian society (especially schools) and abroad.

Other relevant institutions are concerned with the promotion and support of technology and economic development, which can include language technologies. These are:
• Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR), Ministry of Education, University and Research (http://www.istruzione.it/), is the main player in Research and Innovation (R&I), in charge of coordinating national and international scientific activities, supervising the academic system, funding universities and research agencies, and supporting public and private research and technological development. It coordinates the preparation of the National Research Programme (PNR) in consultation with other Ministries, Regions and other stakeholders.
  ◦ The National Research Council (CNR) is the largest public research organisation under the supervision of MIUR.
  ◦ The National Agency for New Technologies, Energy and Sustainable Development (ENEA) has the mission to develop R&D in the energy and environmental fields.
• Ministero dello Sviluppo Economico (MISE), Ministry of Economic Development (www.sviluppoeconomico.gov.it), manages industrial innovation.
  ◦ The Department for Competitiveness is in charge of technological innovation and responsible for industrial policy, industrial districts, energy policies, policies for SMEs, and instruments to support the production system.

  ◦ The Department of Development and Social Cohesion (DPS) is in charge of the planning, coordination and management of the structural funds.
• Agenzia per l'Italia Digitale (AgID), a public agency established in 2012, pursues the highest level of technological innovation in the organization and development of public administration and in the service of citizens and businesses. It is in charge of the Italian Digital Agenda (IDA) under the control of the Prime Minister’s office.
• The Inter-Ministry Committee for Economic Planning (CIPE) has the role of coordinating science and technology policy, focusing on medium- and long-term actions.
• Other Ministries (Health, Agriculture, Defence, etc.) manage research funds in their specific fields. Regions, under the concurrency principle, develop local initiatives in R&I and contribute to policy making on R&D; in some cases, research organisations are funded and managed by Regions.
• Italian Association for Artificial Intelligence (AI*IA) http://www.aixia.it/

78 Statute of South Tyrol Articles 19, 99-102

Finally, it is worth mentioning that Italy has more than 15 research labs working in the Human Language Technologies (HLT) field, including:
• CNR – Istituto di Linguistica Computazionale – Pisa (ILC-CNR),
• CNR – Istituto per le scienze della cognizione (Institute for Cognitive Sciences) – Sezione di Padova (phonetics, speech technologies),
• Eulogos,
• Scuola Normale Superiore di Pisa – Centro per la fonetica sperimentale (Centre for Experimental Phonetics),
• Istituto Trentino di Cultura - Centro per la Ricerca Scientifica e Tecnologica (ITC IRST),
• Centro Ricerche Fiat,
• IBM Research Centre,
• Synthema,
• Fondazione Ugo Bordoni,
• Università degli Studi di Ancona,
• Università di Bari - Sistemi di Elab. dell'Informazione,
• Università degli Studi di Firenze,
• Università degli Studi di Genova,
• Università degli Studi di Napoli – CIRASS,
• Università degli Studi di Roma 3 Tor Vergata,
• Università degli Studi di Torino - Dipartimento di informatica,
• Università degli Studi di Udine,
• Università degli Studi di Venezia Cà Foscari - Laboratorio di Linguistica Computazionale,
• Università degli Studi di Verona,
• Fondazione Bruno Kessler, Trento.

14.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES

In Italy, neither LT strategies nor LT policies have been defined so far. The 2001 constitutional reform transformed the system of government and the distribution of powers. The state now has competence in a limited number of areas (including foreign relations, immigration, social security and some general provisions on education), whereas regions have legislative powers in all matters that are not explicitly covered by state legislation79, such as language promotion.

In practice, laws and regulations supporting the promotion, use and learning of Italian and of regional languages, both in the corresponding regions and abroad, have long existed and remain in force, and it is the regional governments that publish public calls for funding translations or the generation of content in their official languages. As far as LT is concerned, support for the development of new products and solutions based on language technologies is included in programmes and policies (either regional or national) that support ICT and/or generic research and innovation projects.

14.4.1 PAST LANGUAGE STRATEGIES AND POLICIES (IF RELEVANT)

As mentioned, the country is split into different regions and provinces and, according to the Constitution (Article 6), the Republic is committed to the promotion of local autonomies and protects linguistic minorities with special legislation. Several national and regional laws were issued in the past decades to safeguard the autochthonous minority languages, most notably in the autonomous border regions.
• In this respect, the most far-reaching special legislation, actually requiring bilingual qualifications for public servants, has been the so-called "pacchetto Alto Adige", adopted in 1971 for the autonomous province of Bolzano, where the majority of the population belongs, in fact, to the German-speaking minority.
• More recently, a comprehensive law for the safeguarding of the so-called Historic Linguistic Minorities (Law 482/1999) was adopted, aiming at the protection «of the languages and culture of the Albanians, Catalans, Germans, Greeks, Slovenians and Croatians, as well as of those speaking French, Friulan, Ladin, Occitan and Sardinian». The law established a National Fund for the Safeguard of Linguistic Minorities at the Prime Minister's Office, providing for the teaching of the above-mentioned minority languages and cultural traditions, and for their use in official acts at the national, regional and local level. Furthermore, the law requires the service to safeguard historic minority languages via "Public Service Contracts", under the supervision of the Authority for Guarantees in Communication.

79 http://www.queensu.ca/mcp/national-minorities/evidence/italy

On the other hand, the priorities and areas of intervention in Italy are determined at government level, as provided by the reform of the National Research System (Legislative Decree no. 204/1998). From 1998, the Italian government guaranteed a programmatic orientation for research. The key mechanism of this orientation was the National Research Programme (1998-2013), a strategic document formulated by the Ministry of Education, Universities and Research (MIUR) and approved by the Inter-Ministry Committee for Economic Planning (CIPE). The main challenges facing the research system are:
• Insufficient resources for Higher Education.
• Low share of skilled human capital.
• Low R&D intensity and specialization of firms.
• Size distribution of firms.
• Increasing territorial inequalities.

The main priority areas defined in this Programme are:
• Environment
• Energy
• Food
• Cultural heritage
• Security
• ICT
• Sustainable mobility
• Health and science
• Augmented sensitivity

Therefore, although LT can be addressed within some of the priority areas mentioned above (those in bold), it is not considered a specific priority.

14.4.1.1 PROJECTS AND INITIATIVES

Italy was not among the larger EU countries that launched large-scale HLT programmes during the late 1980s and early 1990s. However, some projects and initiatives have been set up in the area.

In 1997, HLT was designated a national research policy, with the launch of two three-year projects:
• TAL, a national framework for developing language resources,
• LRCMM, devoted to monolingual and multilingual research in computational linguistics, with a view to strengthening innovation in this field.

There are also some initiatives at international level:
• Forum for HLT in Italy,
• HLT Network,
• The National Project in Natural Language Processing.

However, the vast majority of projects in the area have received regional support. This is the case for the following ones, which were funded in the Friuli Region:
• GDB TF - Grant Dizionari Bilengâl Talian Furlan: a bilingual electronic dictionary from the Italian language to the Friulian language. The GDB TF contains almost 62,000 Italian headwords (46,500 single-word headwords and 15,500 multiword headwords) and 63,500 Friulian headwords (45,000 single-word and 18,500 multiword). For each Italian headword, it lists the different meanings that the word can have, example phraseology, and synonyms and antonyms in both Italian and Friulian, separated by meaning. This completeness in the definitions is matched by very advanced headword search functions. In addition, the GDB TF allows users to see the of almost all the Friulian headwords it contains. The software is provided under an open source licence and is available for different operating systems: Linux, Macintosh and Windows.
• DOF – Dizionari Ortografic Furlan: Friulian language spelling dictionary http://www.arlef.it/struments/grant-dizionari-talian-furlan
• COF – Coretôr Ortografic Furlan: Friulian language spell checker http://www.arlef.it/struments/coretor-ortografic-furlan

14.4.2 CURRENT & FUTURE NATIONAL POLICIES AND STRATEGIES

During the last 15-20 years, Italy remained on the margins of the debate on Community policies for research and innovation. However, Italy aims to be more present and to play a leading role in the future. In this context, in March 2013 the MIUR presented Horizon2020 Italy (HIT 2020), the first document defining the research and innovation strategy for the next seven years, which aligns Italian and European research.

HIT2020 provides a more integrated approach not only because it intends to optimize inter-institutional cooperation between the MIUR and MISE, directly involved in the planning policies to support research and innovation, but because it aims to encourage involvement in policy making of departments and agencies with responsibilities and authority not primarily linked to these functions.

Together with HIT2020, the Agenzia per l'Italia Digitale (AgID) promoted the Strategia per la crescita digitale 2014-2020, which aims to direct technological choices in the field of ICT in different areas (infrastructure, learning, security, health, tourism, agriculture, smart city & communities, open data, e-business, e-administration, etc.) in the coming years. LT is not specifically mentioned among the priorities, although priorities such as e-Tourism or Open Data will require language technologies for their implementation.


Therefore, the research carried out has revealed that at national level there is neither a strategy for languages or minority languages nor a linguistic policy. It is the duty of the government of each region to develop programmes or projects to deal with this issue (e.g. PIANO GENERALE DI POLITICA LINGUISTICA PER LA LINGUA FRIULANA DI CUI ALL’ART. 25 DELLA LEGGE REGIONALE 18 DICEMBRE 2007, N. 29 2015-2019).

Given that languages (promotion, learning, content development, etc.) are not considered a priority for the country as a whole, it seems reasonable to assume that no policy or strategy will be developed in the near future.


15. LATVIA

15.1 BACKGROUND

15.1.1 COUNTRY CHARACTERISTICS

Latvia is a democratic parliamentary republic established in 1918. The capital city is Riga, the European Capital of Culture 2014. Latvian is the official language. Latvia is a unitary state, divided into 118 administrative divisions, of which 109 are municipalities and 9 are cities. The country is a member of NATO, the European Union, the United Nations, the Council of Europe, CBSS, the IMF, NB8, NIB, OSCE and WTO. It is currently in the accession process for joining the OECD. For 2013, Latvia was listed 48th on the Human Development Index and as a high income country until 1 July 2014.

Latvia has a population of 2.1 million. The country has a generally high level of education. Russian is a majority language spoken by a minority, whereas Latvian is a minority language spoken by a majority (i.e. a majority language that needs the kind of protection usually necessary for threatened minority languages).

15.2 LANGUAGES

15.2.1 OFFICIAL

According to Article 4 of the Constitution of 1922, revised and revitalized, “The Latvian language is the official language in the Republic of Latvia”. On April 30, 2002, as part of the so-called "language amendments" to the Constitution, Article 18 was supplemented with the provision that every Member of Parliament is obliged to swear or to give a promise "to be loyal towards Latvia, strengthen its sovereignty and the Latvian language as the sole State language".

Latvian is the national language, but it is also an “endangered” language due to the small size of its native user population. The Valsts valodas likums, or State Language Law, of 1992 allowed the use of other historic languages of Latvia, such as Russian and German, in limited cases, while Latvian remained the only official language. Latvian is the native language of at least 1.5 million speakers worldwide, and at least half a million more people use Latvian alongside their native language. However, due to a low birth rate, the number of native Latvian speakers is decreasing by approximately 5,000 (0.3%) every year. It should also be mentioned that more than 30% of Latvia’s population are native Russian speakers.

The Latvian language has had a standardised literary form since the 16th century. More than 200 newspapers are printed in Latvian, with a total circulation of 100 million annually, plus more than 300 magazines and approximately 2,500 book titles. Five TV channels (three of them also using ) and around 20 radio stations broadcast in Latvian. Currently the Latvian language is ranked the 150th largest language in the world.


15.2.2 CO-OFFICIAL LANGUAGES

In 2012, Latvia held a referendum on making Russian the second state language. A majority voted against it. However, many Russian-speaking citizens (30+% of the Latvian population) were in favour80.

FIGURE 6 LATVIAN ETHNIC GROUPS: RED ARE RUSSIAN SPEAKERS, BLUE ARE BELARUSIAN SPEAKERS

15.3 RELEVANT ORGANISATIONS

The Latvian State Language Centre (Latvian: Valsts valodas centrs) is the regulator for the Latvian language, created in 1992 and based in Riga. It is entitled to impose fines for violations in the field of language use. Since 2009, the Centre of Terminology and Translation has been part of the State Language Centre.

The Latvian Language Agency is a direct administration institution supervised by the Minister of Education and Science. Its aim is to enhance the status and promote the sustainable development of Latvian as the official state language of the Republic of Latvia and an official language of the European Union. It was founded on July 1, 2009, after the reorganization of the State Language Agency and the National Agency for Latvian Language Training. It implements the official language policy formulated in the Guidelines of the State Language Policy for 2015-2020 (available in Latvian and in English).

The Latvian government has been running programmes for teaching Latvian since the 1990s. By 2008, about 93% of Latvia’s minorities (Russians, Estonians, Germans, Poles, etc.) acknowledged having some Latvian language skills. The Latvian government provides bilingual education in eight minority languages: Belarusian, Estonian, Hebrew, Lithuanian, Polish, Romani, Russian, and Ukrainian81. In these bilingual schools or classes, Latvian is taught as a second language to provide command of Latvian and promote social integration. The Latvian language complements learners’ native language in increasing proportion, both as a second language and as a means of instruction, thus ensuring a high level of proficiency in both languages82.

80 http://www.euroviews.eu/2014/2014/04/13/russian-speakers-protest-in-riga-for-preservation-of-their-language/#sthash.qjSK46EN.dpuf
81 Source: http://latviaspb.ru/en/policy/4641/4642/4643/

Research into the Latvian language is carried out at the Latvian Language Institute, the University of Latvia and the universities of Liepāja and Daugavpils. Latvian is taught and studied in several universities throughout the world, such as the University of Washington. The standardisation and codification of Latvian are carried out by the Latvian Language Expert Commission of the State Language Centre.

A list of Latvian language resources for translators is available from the EU83.

15.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
15.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

1989-90: Latvian was given a special boost as the national or ‘Republican’ language. All government, administrative and public contact personnel had to speak Latvian (and/or Estonian) with a time limit on proving their competence.

1991: The Language Laws were amended as the transition from Russian came to an end. This was followed by a massive programme of language attestation for all public contact occupations. Latvian became the sole language of government, public administration (although Russian could be used informally), and higher education.

The Language Law of 1999 states its aims in Article 1:
• the preservation, protection and development of the Latvian language;
• the preservation of the cultural and historical heritage of the Latvian nation;
• the right to use the Latvian language freely in any sphere of life in the whole territory of Latvia;
• the integration of national minorities into Latvian society while respecting their right to use their mother tongue or any other language;
• the increasing influence of Latvian in the cultural environment of Latvia by promoting faster social integration.

Between 1999 and today, Latvia has undertaken a number of strategic projects aimed at providing the country with a basic infrastructure for language technology in a digital age. In 2004 the Development of the Latvian Language Corpus was initiated by the State Language Commission. This has led to the creation of three corpora, including a Corpus of Transcripts of the Saeima Sessions. Work on these corpora is ongoing. It is presumed that this corpus work will help provide a powerful resource base for developing further language technology solutions for Latvian in the years ahead.

82 Source: https://metranet.londonmet.ac.uk/fms/MRSite/Research/cice/pubs/2012/2012_145.pdf
83 http://ec.europa.eu/translation/latvian/latvian_en.htm

2005-2010: The government developed a series of Guidelines of the State Language Policy.

2006-2010: The State Language Policy Programme was launched. Under this programme, the National Library of Latvia has been creating the “Letonica” Latvian Digital Library.

In 2009, the government launched the Language Shore initiative to create stronger partnerships between government, academia and industry to develop an LT expertise hub.

In 2012 Latvian voters overwhelmingly rejected a proposal to give official status to Russian.

15.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

Current language policy is directed at protecting the languages of Latvia’s minorities while maintaining social cohesion through enforcing Latvian as the official language. Today 9 out of every 10 inhabitants speak Latvian.

There have been two key tasks for the language policy of Latvia since the adoption of the Language Law (1989; amendments in 1992):
• create a mechanism for ensuring the competitiveness of the Latvian language and its priority in the highest sociolinguistic functions, as well as for protecting the linguistic human rights of speakers of the Latvian language;
• guarantee the opportunity to preserve, develop and use the languages of the minorities of Latvia for certain functions.

The National Programme for Latvian Language Training (NPLLT) is designed to promote the learning of Latvian as a second language for minorities in Latvia. The “Society Integration Programme” is oriented towards the Latvian language as the means of social integration. The state language policy programme is being developed to guarantee the linguistic quality and competitiveness of Latvian in all areas of science, business, education and development.

In terms of digital technology, Latvia is involved in a number of internal and Europe-wide programmes to gradually build the kind of infrastructure the language will need to maintain a sustainable existence on the Web and, more generally, in a digital economy.

NEEDS

Latvia today needs a fully-fledged national programme in language technology and standardised resource development, networked with all interested parties in Europe. For example, there is still no Latvian WordNet or Latvian , two technology resources that would strongly drive further research by applying standards in language technology.

It also needs to train more computational linguists to build expertise in universities.

It must also ensure that Latvian is a digitally supported language with a full presence on the web within the European Digital Single Market, and equipped with appropriate technological support to provide Latvian with a global presence. This covers text technologies (writing systems, parsers, text corpora, etc.) and speech technologies (for eventual speech recognition and synthesis services in medical, legal and other professions), as well as for Latvian citizens using smartphone apps, etc.

It must also constantly create, maintain and make available standardised Latvian terminology for all areas of national interest (technical, administrative and social media). This is a continuous, long-term process that requires human resources and data collection/analytic resources. It is important that Member States can monitor, understand and share the linguistic performance of their population as generations succeed each other.

CURRENT ACTIONS

Digital Government

The Latvian government has an affirmative information access policy, which in the case of language involves making automated translation services available for citizens. Notably, the ongoing Hugo translation service (developed by a local company) translates between Latvian and English.

Citizenship and Language

On 9 May 2013, the Saeima (Parliament) adopted Amendments to the Citizenship Law (hereinafter “Amendments”). A specific paragraph of the Amendments deals with the Latvian language naturalisation test and exemptions therefrom. As a result of the Amendments, former military personnel of the USSR (Russia) who opted to remain living in Latvia after the break-up of the Soviet Union now have the possibility to acquire Latvian citizenship by completing the naturalisation procedure.

Latvia continues to develop and finance its liberal education model – the state finances national minority education programmes in seven languages: Russian, Polish, Hebrew, Belarusian, Ukrainian, Estonian, and Lithuanian. Currently (academic year 2014/2015), the state finances 109 schools in one of the afore-mentioned languages and 65 schools that have both Latvian and minority language programmes. Secondary schools are entitled to determine which subjects are taught in Latvian, but the total should be 60% of all subjects. Primary schools have the option of choosing from five national minority education models, one of which allows schools to devise their own unique educational model.


16. LITHUANIA
16.1 BACKGROUND
16.1.1 COUNTRY CHARACTERISTICS

On 11 March 1990, a year before the formal break-up of the Soviet Union, Lithuania became the first Soviet republic to declare itself independent, resulting in the restoration of an independent State of Lithuania.

Lithuania is a member of the European Union, the Council of Europe, a full member of the Schengen Agreement and NATO. It is also a member of the Nordic Investment Bank, and part of Nordic-Baltic cooperation of Northern European countries. The United Nations Human Development Index lists Lithuania as a "very high human development" country. Lithuania has been among the fastest growing economies in the European Union and is ranked 24th in the world in the Ease of Doing Business Index.

16.2 LANGUAGES
16.2.1 OFFICIAL

Lithuanian (lietuvių kalba) is the official state language of Lithuania and is recognized as one of the official languages of the European Union. There are about 2.9 million native Lithuanian speakers in Lithuania and about 200,000 abroad. Lithuanian is a Baltic language written in a Latin alphabet. It is closely related to Latvian, and the two are partially mutually intelligible. The Lithuanian language is often said to be the most conservative living Indo-European language, retaining many features of Proto-Indo-European now lost in other Indo-European languages. In Lithuania, more than 90% of non-Lithuanian speakers have fair to good proficiency in Lithuanian.

16.2.2 CO-OFFICIAL LANGUAGES

The largest minority languages are Russian and Polish, spoken natively by 8.2% and 5.8% of the population respectively, but they have no official status84.

FIGURE 7 LITHUANIA’S MAP

84 Source: http://www.truelithuania.com/topics/culture-of-lithuania/languages-in-lithuania

16.3 RELEVANT ORGANISATIONS

The State Commission of the Lithuanian Language (Lithuanian: Valstybinė lietuvių kalbos komisija) is the official regulatory body for the Lithuanian language. The Language Commission went into operation in 1961 as a non-governmental entity under the auspices of the Lithuanian Academy of Sciences. It is now a state-run institution, founded under the auspices of the Seimas (parliament) of Lithuania. The mandate of this Commission comprises not only regulation and standardisation of the language, but also implementation of the official language status. Commission decrees on linguistic issues are legally binding on all companies, agencies, institutions, and the media in Lithuania.

The State Commission of the Lithuanian Language is a State institution accountable to the Seimas.

The Seimas appoints and dismisses the members of the Language Commission upon the recommendation of the Committee on Education, Science and Culture. Universities, scientific research institutions and creative unions submit proposals to the Committee on Education, Science and Culture of the Seimas.

The Language Commission comprises 17 members, who are appointed for a five-year term. There is no limit on the number of terms a member may serve. The Commission is headed by a chairman. Meetings of the Language Commission are held at least once a month. A decision is passed if at least two thirds of the Commission’s attending members vote in favour of it.

The decisions of the Language Commission are obligatory for State and municipal institutions, all of the offices, enterprises and organisations operating in the Republic of Lithuania.

The Language Commission decides issues concerning the implementation of the Law on the State Language, establishes the directions for regulating the Lithuanian language, decides issues of standardisation and codification of the Lithuanian language, and appraises and approves the most important standardising language works (dictionaries, reference books, guidebooks and textbooks), etc.

The Secretariat of the Language Commission provides for the needs of the Commission and activities of its experts and subcommittees, etc.

A list of Lithuanian language resources for translators is available from the EU85.

85 http://ec.europa.eu/translation/lithuanian/lithuanian_en.htm

16.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
16.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

The Law on the State Language was promulgated in 1995. Article 14 states that Lithuanian is the State language. Article 117 states that court trials shall be conducted in the State language. Non-Lithuanian speakers can use an interpreter.

The government passed the Law on the Term Bank on 23 December 2003. The Lithuanian Term Bank was then created by its managers, the State Commission of the Lithuanian Language (the “Language Commission”) and the Chancellery of the Seimas of the Republic of Lithuania, which also ensure its operation, continued maintenance and updating. The Chancellery of the Seimas ensures the provision or development, as well as the operation, maintenance and upgrading, of the hardware and software necessary to operate the Term Bank.

16.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

The Lithuanian language has been included in the legislation on protecting the cultural and ethnic heritage as part of the country’s cultural identity. The Programme for the Expansion of the Lithuanian Information Society 2011–2019 sets the strategic goals of improving the quality of life of the Lithuanian people and the business environment through the use of digital opportunities, and of ensuring that at least 85% of the Lithuanian population have Internet access by 2019.

The government of Lithuania has set two objectives:
• digitalising the objects of Lithuanian cultural heritage and using them as a basis for developing digital products available to the public, thus ensuring the conservation and dissemination of digital content online;
• integrating digital products of the Lithuanian language with the Internet to ensure the full-scale functioning of the Lithuanian language in both its written and spoken form across all aspects of the life of the nation.

NEEDS

The key need is to overcome the fragmented state of digital language resources and lack of coordination in technology development. There is still no Lithuanian WordNet or thesaurus in terms of digital semantic tools. Furthermore, there is no adequate Lithuanian grammar or available geared towards innovative language technology development.

There is a need for standardized tools and resources. The available language resources that could be used as a basis to build language technology have been developed by separate institutions, groups of researchers or businesses that did not always follow the generally accepted standards. Their compatibility with the global language technology value chain is therefore somewhat limited or not economically viable. The resources will have to be recompiled to conform to new standards. There are also very few online tools for end users, consumers, and commercial developers.

CURRENT ACTIONS

The Lithuanian government is committed to ensuring the expansion of language technology, as demonstrated by the programmes funded by various governmental institutions and the European Union structural funds. These programmes are aimed at designing and improving language technology and, with it, the relevant digital language resources. However, due to the small market of users of language technology and tools, the fragmented infrastructure of research and studies, and the lack of clear priorities and coordination, there is little initiative in private business to build resources.

There are several projects in progress in Lithuania aimed at applying international standards to older resources (e.g. the Corpus of Modern Lithuanian) or designing new products.


17. NETHERLANDS
17.1 BACKGROUND

The Netherlands is a small country, located in the north-west of Europe. It is bordered in the east by Germany, in the south by Belgium and in the west by the North Sea. With over 16.5 million people and a population density of 488 people per km², the Netherlands is the most densely populated country of the European Union and one of the most densely populated countries in the world. The total size of the Netherlands is 41,500 km². Amsterdam is the capital, but the government resides in The Hague. More than 40% of the total population lives in the Randstad, the agglomeration of the cities of Amsterdam, Rotterdam, The Hague and Utrecht.

Water dominates the Dutch landscape. Three big European rivers (Rhine, Meuse and Scheldt) reach the sea via the Netherlands and create an important delta. 26% of the Netherlands is below sea level. Over a centuries-long battle against the water, the Dutch constructed a water system consisting of dykes, polders and weirs. However, the Netherlands offers more variation than the familiar green, flat polder landscape with black and white cows.

The Netherlands has a long tradition of consultation and cooperation of government bodies, stakeholder organizations, and citizens. Within this framework, policy on national and international issues is prepared by central government and forms the basis for legislation ratified by the Dutch Parliament. Policy related to the provinces and municipalities is devolved to government at these levels, closer to the people and on the principle of promoting public participation in democracy. Close cooperation between all levels of government inherent in the Dutch system ensures the necessary checks and balances.

17.2 LANGUAGES
17.2.1 OFFICIAL

The official national language of the Netherlands is Dutch, spoken by almost all people in the Netherlands. Dutch is also spoken and official in Aruba, Brussels, Curaçao, Flanders, Sint Maarten and Suriname. It is a West Germanic language that originated in the Early Middle Ages (c. 470) and was standardized in the 16th century. With about 23 million native speakers, Dutch is the common spoken and written language of the majority of the population in the Netherlands. Apart from Frisian (see below), there are several immigrant languages, but no reliable figures are available.

17.2.2 CO-OFFICIAL LANGUAGES
17.2.2.1 FRISIAN

There is one co-official language in the Netherlands: Frisian. Frisian is spoken in the province of Friesland by 453,000 speakers.


17.2.2.2 ENGLISH

English is an official language in the special municipalities of Saba and Sint Eustatius (BES Islands). Amsterdam also recognizes English as an official language, but with a lower status than Dutch: communication with the municipality can be done in English, but Dutch remains the language of publications, meetings, and administration. A large majority of primary and secondary education in Amsterdam remains in Dutch only, but there are some bilingual Dutch-English schools. On Saba and Sint Eustatius, the majority of education is in English only, with some bilingual English-Dutch schools.

17.2.2.3 PAPIAMENTO

Papiamento is an official language in the special municipality of Bonaire.

DIALECTS

Several dialects of Dutch Low Saxon (Nederlands Nedersaksisch in Dutch) are spoken in much of the north-east of the country and are recognized as regional languages according to the European Charter for Regional or Minority Languages. Low Saxon is spoken by 1,798,000 speakers86. Another Low Franconian dialect granted the status of regional language is Limburgish, which is spoken in the south-eastern province of Limburg. Limburgish is spoken by 825,000 speakers. Though there are movements to have Limburgish recognized as an official language (meeting with varying amounts of success), it is important to note that Limburgish in fact consists of a large number of dialects that share some common aspects but are quite different. Both Low Saxon and Limburgish spread across the Dutch-German border and belong to a common Dutch-German dialect continuum.

The Netherlands also has its own sign language, called Nederlandse Gebarentaal (NGT). It is still waiting for recognition and has 17,500 users.

86 https://en.wikipedia.org/wiki/Languages_of_the_Netherlands_-_cite_note-9

FIGURE 8 PROVINCES OF THE NETHERLANDS

17.3 RELEVANT ORGANISATIONS

• The Dutch Language Union (Nederlandse Taalunie): an intergovernmental language policy organisation
• Society of Our Language (Genootschap Onze Taal): a private initiative
• General Dutch Union (Algemeen Nederlands Verbond): a private initiative
• Institute for Dutch Lexicology (Instituut voor Nederlandse Lexicologie): study of Dutch language
• Meertens Institute: study of Dutch language, its dialects and culture
• Huygens ING Institute: study of Dutch literature and history

17.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
17.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

In 2004 a consortium of ministries and organizations in the Netherlands and Flanders launched the comprehensive Dutch-Flemish HLT programme STEVIN (a Dutch acronym for “Essential Speech and Language Technology Resources”). To guarantee its Dutch-Flemish character, this large-scale programme was carried out under the auspices of the intergovernmental Dutch Language Union (NTU). The aim of STEVIN was to contribute to the further progress of HLT for the Dutch language by raising awareness of HLT results, stimulating the demand for HLT products, promoting strategic research in HLT, and developing HLT resources that are essential and are known to be missing. A structure was set up for the management, maintenance and distribution of HLT resources. The STEVIN programme ran from 2004 to 2009, and resulted in many HLT activities in the Dutch language area, which were reported on at previous LREC conferences (2000, 2002, 2004).

This programme has yielded significant progress in the availability of basic resources for the Dutch language, some initial research and several end-user applications.

17.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES

Some of the results of the projects mentioned above are exploited in industry and academia, e.g. through the CLARIN NL Project and, more recently, CLARIAH. However, language technology is not a focus of government attention in the Netherlands, and there is no specific LT funding policy or programme.

NEEDS

It is important that the activities started with the STEVIN programme are continued, so that full advantage is taken of the scientific and commercial opportunities. At present there is a lack of continuity in research and development funding.


18. POLAND
18.1 BACKGROUND
18.1.1 COUNTRY CHARACTERISTICS

A nation with a proud cultural heritage, Poland can trace its roots back over 1,000 years. Positioned at the centre of Europe, it has known turbulent and violent times. There have been periods of independence as well as periods of domination by other countries. Several million people, half of them Jews, died in World War II.

Poland was the birthplace of the former Soviet bloc's first officially recognized independent mass political movement, when strikes at the Gdansk shipyard in August 1980 led to agreement with the authorities on the establishment of the Solidarity trade union. The shoots of political freedom were trampled again 16 months later when communist leader Wojciech Jaruzelski declared martial law. But the movement for change was irreversible. Elections in summer 1989 ushered in Eastern Europe's first post-communist government. A new era began when Poland became an EU member in May 2004, five years after joining NATO and 15 years after the end of communist rule.

Poland has been a relatively stable democracy since the end of communist rule. The economy has boomed since EU accession in 2004, and Poland is one of the region's top-performing countries, although unemployment remains high. The governing coalition seeks deeper EU integration and eventual euro membership, although it faces a challenge from the Euro-sceptic Law and Justice Party. Poland is one of Europe's most pro-American countries.

Poland has made major economic strides since the fall of communism, and especially since joining the EU. In 2009, when all the major European economies were contracting because of the credit crunch, Poland was the only country in Europe to experience economic growth. There has been marked success in creating a market economy and attracting foreign investment. Germany is now Poland's biggest trading partner. There was a massive movement of workers to Western Europe in the years after Poland joined the EU, but the exodus slowed down after the global economic crisis took hold.

Poland still has a huge farming sector - agriculture accounts for about 60% of the country's total land area - but the sector remains hampered by inefficiency, structural problems and lack of investment.

Warsaw's profile on the international stage was raised by its support for the US-led military campaigns in Iraq and Afghanistan. More recently, it has found itself close to the front line in Russia's military campaign against Ukraine after the fall of that country's pro-Moscow government in 2014.


18.2 LANGUAGES
18.2.1 OFFICIAL

Polish is the official language in Poland. With 38 million speakers in Poland, 2 million elsewhere in Europe and around 8 million native speakers outside Europe, the Polish language is one of the 10 most widely spoken languages in Europe and the 6th language of the EU by number of native speakers.

Together with Czech, Slovak, Kashubian, Upper and Lower Sorbian and Polabian (a dead language), it belongs to the West Slavic subgroup of the Indo-European language family.

18.2.2 CO-OFFICIAL LANGUAGES

Generally speaking, Poland is a monolingual country with dialects used only in rural areas. There are two exceptions: Kashubian (which has the status of a regional language) and the Silesian dialect/language (whose status is under discussion).

The language minority communities constitute no more than 3-4% of the citizens of Poland. The largest groups are the speakers of German, Belarusian, Ukrainian, Lithuanian, Kashubian, Czech and Slovak.

The governing bodies are legally obliged to support activities for the maintenance and development of the minority and regional languages.

FIGURE 9 LANGUAGES IN POLAND


18.3 RELEVANT ORGANISATIONS

The Council for the Polish Language is the authoritative institution that expresses opinions and gives advice on issues concerning the use of the Polish language. Every second year it presents a report on the protection of the Polish language to the Parliament.

18.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
18.4.1 PAST LANGUAGE STRATEGIES AND POLICIES

Poland has quite a long history in language technology research and development. One of the earliest projects dates back to 1967: the creation of the corpus of the frequency dictionary of contemporary Polish by an interdisciplinary team of researchers from the University of Warsaw.

The early efforts included projects that aimed at the creation of a representative Polish morphological dictionary, such as POLEX (1993-1996), and later plWordNet (2008), which built the first Polish wordnet, one of the biggest in the world. Another important project was the IPI PAN corpus (2000).

In the first decade of this century, the National Corpus of Polish project was started, with the goal of creating the biggest Polish corpus, including the first treebank for Polish.

18.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES Polish institutions continue to be active in LT research, and participate in national and European projects. They are involved in the ongoing CLARIN project and contribute to the efforts on the technological infrastructure for language resources and tools.

Poland has a number of excellent centres active in the field of language technology and computational linguistics. Currently at least 12 Polish universities and research centres are active in the field, and many of them offer courses in language technology. Apart from the universities, major research projects are carried out by the language technology group of the Institute of Computer Science of the Polish Academy of Sciences.

NEEDS More financial means are necessary to support projects aiming at developing more sophisticated LT, language corpora and other language resources. LT as a field of research also faces problems. Researchers belong to different communities; there is a need to bring them together, e.g. by organizing single conferences where all stakeholders can meet. Also, computational linguistics needs to acquire a more established place in the academic system.


19. PORTUGAL 19.1 BACKGROUND 19.1.1 COUNTRY CHARACTERISTICS Portugal, officially the Portuguese Republic, is a country on the Iberian Peninsula, in south-western Europe. It is the westernmost country of mainland Europe, being bordered by the Atlantic Ocean to the west and south and by Spain to the north and east. The country also holds sovereignty over the Atlantic archipelagos of the Azores and Madeira, both autonomous regions with their own regional governments.

Administratively, Portugal is divided into 308 municipalities (municípios or concelhos), which after a reform in 2013 are subdivided into 3,092 civil parishes (freguesias). Operationally, the municipality and civil parish, along with the national government, are the only legally identifiable local administrative units recognised by the government of Portugal (i.e. cities, towns or villages have no standing in law, although they may be used as catchment areas for defining services). Continental Portugal is agglomerated into 18 districts (Aveiro, Beja, Braga, Bragança, Castelo Branco, Coimbra, Évora, Faro, Guarda, Leiria, Lisbon, Portalegre, Porto, Santarém, Setúbal, Viana do Castelo, Vila Real and Viseu – each district takes the name of the district capital), while the archipelagos of the Azores and Madeira are governed as autonomous regions.

Currently, the Portuguese Republic has a population of around 10,427,301. Lisbon, the capital, is the biggest city in the country.

19.1.2 LANGUAGES OVERVIEW SITUATION The primary language of the country is Portuguese, which originated in a territory corresponding to Galicia (N-W Spain) and the north of present-day Portugal. Galician/Portuguese remained in use during the period of Arabic predominance and re-established itself as the principal language as its speakers moved southwards. Portuguese was instituted as the language of the court by King Dinis in 1297. Portuguese is now used as an official language in eight countries (Portugal, Angola, Brazil, Cape Verde, Guinea-Bissau, Mozambique, São Tomé and Príncipe and East Timor; the so-called CPLP countries) and one territory, Macau (Macau Special Administrative Region of the P.R. of China). The total number of speakers is estimated at around 220 million, of which 200 million are native speakers, spread over four continents: Africa, America, Asia and Europe 87 88.

There are sizable groups of expatriate Portuguese speakers in various countries around the world, notably in France, Luxembourg, the UK, Switzerland, the US, Canada, and South Africa. 4.1% of the population of Portugal has non-Portuguese nationality (2006; OECD).

87 Dados Estatísticos - Falantes de Português. Observatório da Língua Portuguesa. Internet, 25/01/2012 - http://observatorio-lp.sapo.pt/pt/dados-estatisticos/falantes-de-portugues.
88 M. Paul Lewis, editor. Ethnologue: Languages of the World. Sixteenth edition, 2009. Ethnologue. Internet, 25/01/2012 - http://www.ethnologue.com


19.2 LANGUAGES 19.2.1 OFFICIAL The official language is Portuguese (article 11 – Constitution of the Portuguese Republic), the fifth most spoken European language in the world, with around 220 million speakers. Due to migratory movements 89 90, Portuguese is also spoken by communities in many countries, in some of which it occupies an important position among the foreign population: in Europe, for example, in Luxembourg (around 25% of the population), Andorra (around 11%), France, Germany, the United Kingdom, Switzerland, Spain and Belgium 91.

19.2.2 CO-OFFICIAL LANGUAGES Portugal has one minority language, Mirandese, spoken and to some extent written in the north-eastern border town of Miranda do Douro (population of around 2,000) and in surrounding areas within Portugal by at most 10,000 persons, almost all of them bilingual. Mirandese was recognised in 1999 as co-official with Portuguese for local matters. Linguistically, the Mirandese language belongs to the Asturian/Leonese group.

19.3 RELEVANT ORGANISATIONS The main state organisations in Portugal promoting languages:
• International Institute of Portuguese Language (Instituto Internacional da Língua Portuguesa – IILP): since 2002, its major goals have been the defence of the Portuguese language and of the different Portuguese-speaking cultures.
• Academy of Sciences of Lisbon: contributes to the promotion of the Portuguese language, in particular through the publication of reference dictionaries such as the Dictionary of Contemporary Portuguese.
• Instituto Camões: institution under the Portuguese Foreign Affairs Ministry responsible for the promotion of Portuguese language and culture abroad, officially founded in 1992.
• Gulbenkian Foundation: established in 1956, the Foundation's original purpose focused on fostering knowledge and raising the quality of life of persons through the fields of the arts, charity, science and education. It is engaged in the promotion of the Portuguese language.
• Fundação para a Ciência e a Tecnologia (FCT): the national funding agency supporting science, technology and innovation in all scientific domains, under the responsibility of the Ministry for Education and Science. FCT started its activity in August 1997, replacing the National Board of Scientific and Technological Research (JNICT). Since March 2012 FCT has coordinated public policies for the Information and Knowledge Society in Portugal, after the integration of the Knowledge Society Agency (UMIC). In October 2013 FCT took over the attributions and responsibilities of the Foundation for National Scientific Computing (FCCN).

89 Demography and Population: International Migration Database. Organização para a Cooperação e o Desenvolvimento Económico - OCDE (Organisation for Economic Co-operation and Development - OECD). Internet, 25/01/2012 - http://stats.oecd.org
90 Observatório da Emigração. Internet, 25/01/2012 - http://www.observatorioemigracao.secomunidades.pt .
91 Comunidade Lusófona. Portugal em Linha. Internet, 25/01/2012 - http://www.portugal-linha.pt .


19.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES 19.4.1 PAST & CURRENT NATIONAL POLICIES AND STRATEGIES Portugal emerges as a country that is profoundly aware of the status of its national tongue as the fifth most spoken language on earth, while also recognising the importance of (business) English for Portugal’s role in a globalised world.

The country promotes the regional language Mirandese, spoken by 0.1% of the national population, and has given constitutional protection to Portuguese Sign Language (LGP). Companies generally tend to favour the use of Portuguese, although they also recognise the importance of business English for interaction with foreign customers and companies abroad. Other languages tend not to figure prominently, except for businesses with specific interests in particular foreign countries. In general terms, therefore, the promotion of language competencies in the national language, in English as a lingua franca, or in other languages is not a priority. Multilingualism is not high on the agenda of Portuguese enterprises.

However, the Government is well aware of the importance of a coherent and structured international promotion of the Portuguese language and of the economic value deriving from it, which corresponds to 17% of GNP. The growing demand for the Portuguese language is mainly due to economic and business reasons related to emerging markets such as Brazil and Angola. Consequently, Portugal currently takes part in organizations created to implement cultural and linguistic policies of the EU, such as intercultural dialogue and multilingualism. Through Instituto Camões, Portugal is one of the members of the European Union National Institutes for Culture (EUNIC) and the European Federation of National Institutions for Language (EFNIL), whose main objectives are to strengthen the European identity, which is linked to cultural and linguistic diversity, based on the principle “unity in diversity”.92

19.4.1.1 PROJECTS AND INITIATIVES Among others, the FCT supports the organisation of congresses, conferences and seminars on the Portuguese language and literature. It grants funding to specific research projects, for instance the Reference Corpus of Contemporary Portuguese and the Comprehensive Grammar of Portuguese, both projects of the Centre of Linguistics of the University of Lisbon.

One of the most relevant initiatives related to LT is the Portuguese Infrastructure Roadmap, supported by the FCT, which has led to the inclusion of Portugal as a member of CLARIN ERIC (http://www.clarin.eu/news/portugal-joined-clarin-eric).

92 Portugal and Cultural Diplomacy by Ana Filipa Teles


19.4.2 FUTURE NATIONAL POLICIES AND STRATEGIES 19.4.2.1 RESEARCH AND INNOVATION STRATEGY FOR 2014-2020 (ESTRATÉGIA DE I&I 2014-2020). The programming principles of the Research and Innovation Policy for the period 2014-2020 in Portugal are closely linked to the Partnership Agreement for the European Structural and Investment Funds (ESIF) between Portugal and the EU (“Portugal 2020”). The Strategy for Research and Innovation for a Smart Specialisation (EI&I - Estratégia Nacional para uma Especialização Inteligente) is crucial for public funding of R&I in Portugal, as it constitutes an ex ante conditionality of the Partnership Agreement for the investment priorities in research and innovation under the ESIF.

In response to the challenges identified in the diagnosis of the Research and Innovation System (Challenges, strengths and weaknesses towards 2020, prepared by the FCT in 2013), five structuring objectives and themes were defined. They group the 15 Strategic Intelligent Priorities in which Portugal shows existing or potential competitive advantages, and they should be the basis for formulating strategic mobilization programs of policy measures and national programming instruments in the period 2014-2020:
• CROSS TECHNOLOGIES AND APPLICATIONS
  - Energy
  - Information and Communication Technologies
  - Raw materials and materials
• INDUSTRY AND PRODUCTION TECHNOLOGIES
  - Production Technology and Product Industries
  - Manufacturing Technologies and Process Industries
• MOBILITY, SPACE AND LOGISTICS
  - Automotive, Aeronautics and Space
  - Transport, Mobility and Logistics
• NATURAL RESOURCES AND ENVIRONMENT
  - Agro-food
  - Forest
  - Economy of the Sea
  - Water and Environment
• HEALTH, WELFARE AND TERRITORY
  - Health
  - Tourism
  - Cultural and Creative Industries
  - Habitat


ICT are therefore vital to strengthening national cohesion and sustainable development. The potential of ICT, both as a scientific and technological area and as an economic and sociocultural activity sector, represents a distinguishing factor of the national research and innovation system for enhancing Portuguese competitiveness.

Finally, it is worth pointing out that the EI&I is a multi-level strategy, in which the National Strategy for Research and Innovation (ENEI) comprises the national challenges and their alignment with the seven regional strategies (North, Central Region, Lisbon, Alentejo, Algarve, Azores and Madeira). Further information is available at https://www.fct.pt/suporte-politicas-IeD/estrategia2020/index.phtml.en .

19.4.2.2 NATIONAL STRATEGY FOR RESEARCH AND INNOVATION (ENEI) The National Strategy for Research and Innovation (ENEI) is the result of fruitful and pioneering cooperation between the Ministry of Economy and the Ministry of Education and Science, carried out through the ENEI Working Group, composed of the Agency for Competitiveness and Innovation (IAPMEI, IP) and the Foundation for Science and Technology (FCT), and supported by the Innovation Agency (ADI) and the COMPETE Management Authority.

ENEI identifies the big bets towards which investment should preferably be directed in the period 2014-2020, maximizing the benefits of a coordinated intervention in the different spaces with which the National System for Research and Innovation (SI&I) interconnects. The vision is based on four fundamental pillars:
• Digital Economy
• Portugal as a country of Science and Creativeness
• Reinforce Industrial technological capabilities
• Differential Endogenous Resources valorisation

The diagnosis, made with the decisive contribution of the stakeholders, identifies the potential areas where Portugal can be competitive. Among them is Technology of the Portuguese Language: Portuguese is a language with major global deployment, spoken in countries with strong growth, which makes it a critical and decisive bet with crossover potential for ICT, given the socio-economic and cultural-historical relevance of the Portuguese language.


20. ROMANIA 20.1 BACKGROUND 20.1.1 COUNTRY CHARACTERISTICS Romania is located in southeastern Europe, in the contact zone between Central Europe, the Balkans, the Near East and the Slavic region. It is an area of multiple linguistic and cultural influences.

It borders Moldova, Ukraine, Hungary, Serbia, and Bulgaria. In the east, the Black Sea is a natural border. The river Danube, which is Europe's second longest river, empties in Romania's Danube Delta. The Carpathian Mountains cross Romania from the north to the southwest. With an area of 238,391 square kilometers, Romania is the twelfth-largest country in Europe. Bucharest is the capital and largest city of Romania, with a population of approximately 2 million people.

Romania emerged from the territories of the ancient Roman province of Dacia. Modern Romania was formed in 1859 through a personal union of the Danubian Principalities of Moldavia and Wallachia. It gained independence from the Ottoman Empire in the late 19th century with the establishment of the Kingdom of Romania. After World War I, when Transylvania and Bessarabia (formerly part of the Habsburg and Russian Empires respectively) were ceded to Romania, its population and territory doubled, and Romania became ethnically and linguistically much more diverse93. The last king abdicated after World War II and the monarchy was replaced by the Communist regime. After the Cold War, Romania developed close relations with the West and joined NATO in 2004. It became a full member state of the European Union in 2007.

According to the data provided by the National Statistics Institute, based on the latest census report, Romania had a population of over 20 million in 2011. Almost 90 per cent of the population identify themselves as native speakers of Romanian and as Eastern Orthodox Christians.

20.2 LANGUAGES 20.2.1 OFFICIAL Romanian is the official language as established by the Constitution94. Romanian is an Eastern Romance (Balkan Romance) language, together with Aromanian, Megleno-Romanian and Istro-Romanian.

Romanian is spoken by approximately 29 million people worldwide. The language also has official status in the Republic of Moldova and in the Autonomous Province of Vojvodina in the Republic of Serbia. Romanian is a recognized regional language in Ukraine. It is one of the official languages of the European Union.

93 Euromosaic, 2007. http://ec.europa.eu/languages/policy/language-policy/documents/euromosaic-romania_en.pdf
94 http://www.cdep.ro/pls/dic/site.page?den=act2_1&par1=1


The Romanian Constitution does not provide a definition of “minority”. It does, however, recognize the existence of persons belonging to national minorities and guarantees them the right to preserve, develop and express their ethnic, cultural, linguistic and religious identity.

20.3 RELEVANT ORGANISATIONS  “Institutul de Lingvistică „Iorgu Iordan - Al. Rosetti" al Academiei Române (Institute for Linguistics of the Romanian Academy)95 The main objectives of the Institute are the cultivation and promotion of Romanian, research and documentation in the field of the Romanian language and linguistics, development of the research infrastructure, etc. The Institute also publishes a number of scientific journals in the field of the Romanian language and linguistics, and participates in national (Explanatory dictionary of Romanian, Etymological dictionary of Romanian, Romanian linguistic atlas, Toponymic Romanian dictionary, etc.) and international projects (, etc.).

 Research Institute for Artificial Intelligence “Mihai Draganescu” (RACAI) at the Romanian Academy96 The main research projects of RACAI are in the areas of natural language processing, machine learning and knowledge acquisition, computer-aided instruction and integrated modelling of information and geospatial technology. RACAI has been involved in a number of national97 (STAR – A System for Machine Translation for Romanian, ACCURAT-RO –Analysis and evaluation of Comparable Corpora for Under Resourced Areas of machine Translation, CLARIN – Interoperable Linguistic Resources Infrastructure for Romanian, etc.), and international projects98 (Multilingual Web, PARSEME – PARSing and Multi-word Expressions, METANET4U, MUMIA – Multilingual and multifaceted interactive information access, Flarenet, etc.) in the field of language technology, language resources, natural language processing, etc. In the scope of the project METANET4U, RACAI delivered several mono- and multilingual resources and tools for natural language processing on Romanian textual data through META-SHARE. The current priority project within the Romanian Academy is the creation of a reference electronic corpus of contemporary Romanian language (CoRoLa), i.e. a collection of (written and spoken) texts, annotated with metadata and with linguistic data. RACAI, together with the Institute of Computer Science have been collaborating on this project with a number of partners99.

 Romanian Academy Institute for Computer Science in Iași

95 http://www.lingv.ro/
96 http://www.racai.ro/en
97 http://www.racai.ro/en/research-activities/national-projects/
98 http://www.racai.ro/en/research-activities/international-projects/
99 http://www.racai.ro/en/research-activities/corola-program-prioritar-al-academiei-romane/


The Institute has conducted research on a number of topics related to language technology and language resources: sequential/parallel PROLOG compiler/interpreter, methods in computational linguistics, mono- and multilingual dictionaries providing morphological, lexical, and syntactic analyses, parallel algorithms for context-free grammar recognition, and mechanisms for parallel recognition of languages100.

• Faculty of Computer Science of the Alexandru Ioan Cuza University of Iasi (UAIC)101
Research has been conducted within various projects on topics in natural language processing (sentiment/opinion analysis), summarization systems, and the improvement of the retrieval of learning material in e-learning by means of multilingual language technology tools and semantic web techniques. UAIC has also been involved in the digitization of the Thesaurus Dictionary of the Romanian Language.

• Faculty of Mathematics-Informatics of the Babeș-Bolyai University of Cluj-Napoca102
A number of research groups conduct investigations in language technology, language processing, and related fields. Some of the research areas are computational linguistics and methods of quantitative and qualitative analysis of texts, processing of large corpora of text documents, and web technologies.

• Consortium for the Romanian Language: Resources & Tools (ConsILR)
The Consortium for the Romanian Language: Resources & Tools (ConsILR: Consortiul de Informatizare pentru Limba Romana) is an initiative which aims to facilitate and augment the efforts of linguists and computer scientists working on the Romanian language by promoting software tools and resources for linguistic processing. The ConsILR conference is a series of events organized yearly since 2001, aimed at promoting research on language resources and tools, with a special emphasis on Romanian103.

Research into language processing, technology and resources for Romanian is also conducted at research institutions in the Republic of Moldova (Institute of Mathematics and Computer Science at the Academy of Sciences of the Republic of Moldova; Applied Informatics Department at the Faculty of Computers, Informatics and Microelectronics at the Technical University of the Republic of Moldova).

20.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES The major public authority in the field of language policies is the central government. One of the main issues regarding language in Romania is the protection of the languages of the 20 minority ethnic groups. Protective principles can be found in the Local Public Administration Law no 286/2006, including the right to use a mother tongue in administrative procedures and the systematic translation of geographical names and indicators into all the spoken languages of a given area 104. In 2007 Romania adopted Law no 282 in order to ratify the European Charter for Regional or Minority Languages, which was signed in 1992.

100 http://iit.academiaromana-is.ro/iit_tp.html
101 http://www.info.uaic.ro/bin/Main/
102 http://www.cs.ubbcluj.ro/cercetare/grupuri-de-cercetare/
103 http://consilr.info.uaic.ro/2015/

In an attempt to counterbalance the pressure of English on Romanian, the Parliament adopted a Law on the use of Romanian in public places, circumstances and institutions (Law no 500/2004)104. The strong position of English in science and technology may be a cause for concern, and the need for terminology work in the Romanian language in the field of science and technology is expressed in Romania's development strategy for the next 20 years (2016-2035)105.

The Strategy was published by the Romanian Academy in 2015. It covers key areas of Romanian society and sets out guidelines for its future development. Work on the Strategy for the period 2016-2035 was launched in 2014 with the identification of 11 projects on interdisciplinary topics and of the working groups involved in its development (about 200 researchers and experts). The second phase (February-June 2015) was conducted on the basis of contracts with Romanian Academy institutes (financed from revenues of the Romanian Academy). This phase concluded with the report Romania's development strategy for the next 20 years, which summarizes a SWOT analysis for each key area and sets out a vision of the status that Romania should reach by 2035 and of its position in Europe. In the third phase (December 2015), short-, medium- and long-term integration scenarios will be worked out, followed by the identification of the resources needed (fourth phase, December 2016). The Romanian language is part of the key areas Romanian culture, education, and Romania in the era of globalization. The Strategy considers language resources and language technology important for development in the field of artificial intelligence (machine learning, information retrieval and analysis of big data) within the key area information security (cyber security, intellectual property).

The National Strategy on Digital Agenda for Romania of 2014106, which takes up the objectives set by the Digital Agenda for Europe 2020 and adapts them to the current context of Romania, does not explicitly mention language or language technology.

Some research projects in the field of language technology and resources for Romanian have been funded nationally (e.g. eDTLR – the Romanian Thesaurus Dictionary in electronic form107; CoRoLa, a representative corpus of contemporary Romanian), while a large number of projects were funded by various European programmes.

104 http://www.culturalpolicies.net/web/romania.php?aid=425
105 Strategia de dezvoltare a României în următorii 20 de ani (http://acad.ro/bdar/strategiaAR/doc11/Strategia.pdf , retrieved 27 August 2015).
106 https://www.mcsi.ro%2FTransparenta-decizionala%2FProiecte-2014%2FDigital-Agenda-Strategy-for-Romania%2C-8-september- 2&usg=AFQjCNGmeWdnp3SuzL24rjC6XB11Wc26Uw&sig2=DzLPPrcnjMfIfDp27jn7Mw&bvm=bv.102022582 ,d.d24
107 http://profs.info.uaic.ro/~dcristea/papers/Cristea%20et%20al-SPeD07.pdf


21. SLOVENIA 21.1 BACKGROUND 21.1.1 COUNTRY CHARACTERISTICS Slovenia (Republic of Slovenia) is a country in southern Central Europe. It borders Italy to the west, Austria to the north, Hungary to the northeast, Croatia to the south and southeast, and the Adriatic Sea to the southwest. Slovenia is one of the smallest EU countries, covering 20,273 square kilometers. It has a population of 2.05 million108. Its capital and largest city is Ljubljana.

Slovenia is located in the contact zone between the Slavic, Germanic, Romance, and Uralic linguistic and cultural spaces, which makes Slovenia one of the most complex linguistic meeting points in Europe. Historically, the current territory was part of a number of different state formations, including the Roman Empire, the Holy Roman Empire, followed by the Habsburg Monarchy, Illyrian Provinces, Austria-Hungary, Kingdom of Serbs, Croats and Slovenes, and Yugoslavia. Slovenia split from Yugoslavia and became an independent country in 1991. It was the first former Yugoslav republic to join the European Union, in May 2004.

21.2 LANGUAGES The official language is Slovenian, whereas Italian and Hungarian are co-official regional minority languages in those municipalities where the Italian and the Hungarian minorities are present. In the 2002 census, 87.8% of the population declared Slovenian as their mother tongue109. Slovenian is a South Slavic language.

Other significant languages are the languages spoken by immigrants from the former Yugoslavia and their descendants. However, these languages do not have an official status in Slovenia.

Historically, German was the lingua franca in Central Europe and was also used in commerce, science and literature in Slovenia. Consequently, German used to be the first foreign language taught in schools. With the formation of Yugoslavia, the so-called Serbo-Croatian became the language of federal authorities and the first foreign language taught in school.

Nowadays, English is taught as the first foreign language throughout the country. German has retained a strong position as an important language and is the most common second foreign language in high schools. Other foreign languages widely taught are Italian, Spanish, French and Hungarian. At least one foreign language is a compulsory subject in the Slovenian secondary school leaving exam (matura). Slovenia is ranked among the top European countries regarding the knowledge of foreign languages.

108 http://ec.europa.eu/eurostat/web/population-demography-migration-projections/population-data/main-tables (retrieved 5 August 2015).
109 http://www.stat.si/StatWeb/glavnanavigacija/podatki/prikazistaronovico?IdNovice=2957 (accessed 5 August 2015).


Slovenia signed the European Charter for Regional or Minority Languages in 1997. In its latest recommendation, issued in 2014, the Council of Europe recommended recognising German, Croatian and Serbian as minority languages traditionally spoken in Slovenia110.

21.2.1 OFFICIAL The official language is Slovenian. The legal status of the Slovenian language is established by the Constitution111 and by the Public Use of the Slovenian Language Act112. Furthermore, different aspects of language use are regulated by more than 160 legal acts (e.g. legal acts on consumer rights, media, etc.)113

21.2.2 CO-OFFICIAL LANGUAGES Hungarian and Italian, spoken by the respective minorities, enjoy the status of co-official languages in the regions along the Hungarian and Italian borders. The legal status is established by the Constitution.

21.3 RELEVANT ORGANISATIONS
• Ministry of Culture of the Republic of Slovenia, Department for Slovenian Language114
• Fran Ramovš Institute of the Slovenian Language115
The Institute of Slovenian Language was established in 1945 for the purpose of compiling linguistic materials and using them for the creation of Slovenian language resources. It is part of the Research Centre of the Slovenian Academy of Sciences and Arts.
• Centre for Slovene as a Second/Foreign Language116
Operates under the auspices of the Department of Slovene Studies at the Faculty of Arts of the University of Ljubljana.
• “Jožef Stefan” Institute
At the “Jožef Stefan” Institute, three departments are involved in language technologies research for both Slovenian and English: the Artificial Intelligence Laboratory117, the Department of Knowledge Technologies118 and the Department of Intelligent Systems119.
• University of Ljubljana120

110 https://www.coe.int/t/dg4/education/minlang/Report/Recommendations/SloveniaCMRec4_en.pdf (retrieved 10 August 2015)
111 http://www.us-rs.si/o-sodiscu/pravna-podlaga/ustava/ (accessed 20 July 2015).
112 http://www.uradni-list.si/1/objava.jsp?urlid=200486&stevilka=3841 (accessed 20 July 2015).
113 http://www.mk.gov.si/si/zakonodaja_in_dokumenti/veljavni_predpisi/slovenski_jezik/podrocni_zakoni/ (accessed 20 July 2015).
114 http://www.mk.gov.si/si/delovna_podrocja/sluzba_za_slovenski_jezik/
115 http://isjfr.zrc-sazu.si/en
116 http://www.centerslo.net/
117 http://ailab.ijs.si
118 http://kt.ijs.si/
119 http://dis.ijs.si/
120 http://www.uni-lj.si/eng/

LT_OBSERVATORY – OBSERVATORY FOR LR AND MT IN EUROPE LT_OBSERVATORY

Research on Slovenian language is undertaken at the Department for Slovene Language. Language technology research is carried out at the at the Faculty of Arts, Faculty of Social Sciences, Faculty of Electrical Engineering and the Faculty of Computer and Information Science.  University of Maribor121 In addition to the research on Slovenian language conducted at the Faculty of Arts, research on language and speech technologies at the University of Maribor is undertaken mainly at the Faculty of Electrical Engineering and Computer Sciences, in the scope of the Institute for Electronics and Telecommunications and the Institute for Computer Science (Laboratory for Heterogenous Computer systems).  University of Nova Gorica122 At the University of Nova Gorica, linguistics is studied at the Center for Cognitive Science of Language. The primary areas of interest are theoretical and experimental syntax, morphology, semantics, and pragmatics. A considerable amount of time is also devoted to sociolinguistic, language-policy, language-planning issues, and various other applied aspects of linguistics.  University of Primorska123 Language technology research and corpus linguistics are undertaken at the Faculty for Mathematics, Natural Sciences and Information Technologies, mainly at the Department of Information Sciences and Technologies. In addition to the public bodies and higher education institutions mentioned above, the following private research institutes, companies and associations are active in the field of language technology and language resources for Slovenian:  Alpineon, d.o.o.124 Development of hardware and software in the field of language and speech technologies: speech recognition and synthesis, machine translation, voice portals, SMS and email readers.  Amebis, d.o.o.125  Language technology software: spell and syntax checkers, machine translations, speech synthesis, corpora, online dictionaries, virtual agents, etc. 
 Trojina, Institute for Applied Slovene Studies126 Trojina is undertakes projects aimed at modern, targeted linguistic research and at increasing the confidence of speakers in public and private use of the Slovenian language.  Slovenian Language Technologies Society127 The Society was founded in 1998. Its activities are aimed at promoting the development of language technologies for the Slovenian language.

121 www.um.si/en/Pages/ 122 www.ung.si/en/ 123 www.upr.si/ 124 www.alpineon.com 125 www.amebis.si/ 126 www.trojina.si/en/ 127 www.sdjt.si/index-en.html 644583 | DELIVERABLE D3.1 This project is co-funded by the European Union 84 | 114


21.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
21.4.1 CURRENT NATIONAL POLICIES AND STRATEGIES
The Public Use of the Slovenian Language Act, adopted in 2004, provides the legal basis for language policy in Slovenia. In 2007, the National Programme for Language Policy for the period 2008-2011 was adopted as the main instrument provided for by the Act. The Resolution on the National Programme for Language Policy for the period 2014-2018 was adopted in 2013128. It identified a series of goals and measures for implementation at the inter-ministerial level. The measures support, inter alia, the development of the linguistic capacities of all groups of speakers in order to improve their reading skills, promote language skills comparable to those of other European countries, and develop and promote the public use of the Slovenian language.

The main change compared with the first Programme is a shift from the protection of the Slovenian language to language education and language equipment (resources, technology, digitalisation, standardisation, language description, terminology, multilingualism, etc.). More attention is also given to language policies for speakers with special needs.

The Action Plan for Language Equipment prepared in 2014 identified nine areas in which action was needed: general (infrastructure, bibliometrics, wikis, etc.), online portals (language portal, terminology portal, school portal, etc.), corpora, dictionaries and lexica, grammar, digitalisation, language technology applications, sign language. The Action Plan also lists cost estimates and a time plan for each action129.

The Research Infrastructure Development Plan 2011-2020130 identified humanities as one of the priority research areas, with language technologies and language resources for Slovene as one of the main focuses. This document included CLARIN as one of the priority research infrastructures. The Slovene CLARIN, CLARIN.SI, was established in 2013, and became a member of CLARIN ERIC in 2015131.

128 http://www.mk.gov.si/fileadmin/mk.gov.si/pageuploads/Ministrstvo/slovenski_jezik/Resolution_2014-18_Slovenia_jan_2015.pdf (retrieved 10 August 2015).
129 http://www.mk.gov.si/fileadmin/mk.gov.si/pageuploads/Ministrstvo/raziskave - analize/slovenski_jezik/Akcijski_nacrt_za_jezikovno_opremljenost_javna_razprava_popravljeno2.pdf (retrieved 20 August 2015).
130 http://www.arhiv.mvzt.gov.si/fileadmin/mvzt.gov.si/pageuploads/pdf/znanost/nacrt-RI.pdf (retrieved 5 August 2015).
131 http://www.clarin.si/info/general-information/ (accessed 5 August 2015).


22. SPAIN
22.1 BACKGROUND
22.1.1 COUNTRY CHARACTERISTICS
Spain, officially the Kingdom of Spain, is one of the biggest countries in the European Union. Located in south-western Europe, its mainland is bordered to the south and east by the Mediterranean Sea except for a small land boundary with Gibraltar; to the north and northeast by France, Andorra, and the Bay of Biscay; and to the west and northwest by Portugal and the Atlantic Ocean. Along with France and Morocco, it is one of only three countries to have both Atlantic and Mediterranean coastlines. Spanish territory also includes two archipelagos, the Balearic Islands in the Mediterranean Sea and the Canary Islands in the Atlantic Ocean off the African coast; two major exclaves, Ceuta and Melilla, in continental North Africa; and several islands and peñones (rocks). The capital of the country is Madrid, located in the middle of the mainland territory. The country had a population of 46,464,053 in July 2014 (National Statistics Institute).

Spain comprises 17 autonomous communities (regions) and 2 autonomous cities. Autonomous communities are made up of provinces, of which there are 50 in total, and provinces are in turn made up of municipalities. The basic institutional law of each autonomous community is its Statute of Autonomy.

22.1.2 LANGUAGES OVERVIEW SITUATION
Spain is one of the richest countries in Europe in terms of the number and variety of official and recognised languages spoken in its territory. Spanish is the official language throughout the country. It is spoken by the vast majority of the population (99%) and is the mother tongue of around 89% of the total population.

At the same time, there is a set of languages recognised by the Statutes of Autonomy which are co-official in some of the autonomous communities mentioned above. There are also some languages and dialects that are recognised by regional laws in some cases but that are not official or co-official in their respective region or territory.

22.2 LANGUAGES
22.2.1 OFFICIAL
The only official language throughout the territory of Spain is Spanish. Spanish is the second most spoken language in the world by number of native speakers, after Chinese, and the third by total number of speakers, behind English. According to Instituto Cervantes, there are around 470 million Spanish speakers with native competence and 548 million speakers of Spanish as a first or second language, including speakers with limited competence and 20 million students of Spanish as a foreign language. It is expected that by 2030 Spanish will be the second language for business worldwide, due to the growing Latin American market.


In Spain, Spanish is also called Castilian (castellano), as there are other recognised Spanish languages within the country. The Spanish Constitution literally says that “Castilian” is the official language of the state and that all Spaniards “have the obligation to know it and the right to use it”. At the same time, it states that “other Spanish languages will be also official in their respective Autonomous Communities according to their Statutes”.

22.2.2 CO-OFFICIAL LANGUAGES
There are three major co-official languages in Spain, whose status depends on the administration of their respective autonomous communities: Catalan, Galician and Basque. In addition, since 2006 Aranese (spoken in the Val d'Aran) has been officially recognised as a co-official language in Catalonia. Aragonese, Asturian and Leonese are minority languages that are recognised but not official.

22.2.2.1 CATALAN
ORIGIN
Catalan developed from the Latin spoken in the Roman province of Hispania Tarraconensis. The disintegration and fall of the Roman Empire brought about several successive invasions. The Visigoths (414 AD) and the Arabs (711-717 AD) subjugated the entire peninsula, but their languages had little impact on Catalan. In 778 the Franks of Charlemagne conquered a narrow strip south of the Pyrenees, including Barcelona, and established there the so-called Marca Hispanica (Spanish March) as a buffer state against the Muslims. From then on, the local Romance idiom evolved in close contact with the language of Southern Gaul. In this period Provençal was considered a language of prestige and was also adopted by the Catalan troubadours. In spite of the various influences from Gaul, however, Catalan never adopted the two-case system unique to Old French and Occitan.

By the end of the 10th century Catalan was already a fully formed language, clearly distinguishable from its Latin origins. During the 13th and 14th centuries Catalan reached its high point of geographical expansion in the Iberian Peninsula through the conquest of the kingdoms of Valencia and Murcia. The language also spread around the Mediterranean through victory over the kingdoms of Majorca, Sicily, Sardinia (even today there remains a Catalan-speaking population in the area of Alghero), Naples, Athens and Neopatria in Peloponnese. Catalan came to be spoken, even if not always as a first language, in five states around the Mediterranean which were governed by Catalan dynasties. Thanks to the Royal Chancellery, whose style strongly influenced all Catalan writing, the prose of the 14th and 15th centuries was marked by a high degree of uniformity.

Catalan retained its vigour until the union of the Aragonese and Castilian crowns in 1474. After that, although works (mainly grammatical ones) continued to appear, it gradually entered a period of decline. The Catalan Renaissance (Renaixença) began in the late 19th century with the economic progress of Catalonia. During the Second Republic (1931-1939), Catalan was restored to its official language status, but this promising development was checked by the Civil War and its consequences. The use of Catalan in public was forbidden and the language retreated into the home. Since the restoration of democratic institutions, there has been a process to re-establish the use of Catalan.

AREAS WHERE SPOKEN
Catalan (Català) is a Western Romance language spoken in eastern and northeastern Spain, chiefly in Catalonia, Valencia, the Balearic Isles, the eastern fringe of Aragon and some municipalities of Murcia. Outside Spain it is also spoken in the Roussillon region of France, in the northwest Sardinian city of Alghero (l'Alguer), in the small state of Andorra and among emigrants in the USA.

Catalan covers an area of 68,000 km2 with a population of 10 million. It has around 7,353,000 speakers in the world, distributed as follows:
• 4,000,000 mother tongue speakers in Spain (1994 La );
• 260,000 in France;
• 31,000 in Andorra (1990);
• 40,000 in the USA (1961);
• 22,000 in Alghero.

It is estimated that a further 3,000,000 people in Spain speak Catalan as their second or third language, and that 2 million more understand it but are not able to speak it.

The official language of the between 1137 and 1749, Catalan is now co-official (with Spanish) in Catalonia, Valencia and the Balearic Isles. It is the only official language in the state of Andorra.

LEGAL FRAMEWORK
The legal framework on language in Spain is to be found in the 1978 Constitution, mainly in article 3, and in the statutes of autonomy of Catalonia, Valencia, the Balearic Islands and Aragon. It is implemented in Catalonia through the 1998 law on language policy (which replaces the 1983 law), in the Balearic Islands through the 1986 law on language policy and in Valencia through the 1983 law on the use and teaching of Valencian. In accordance with this legislation, Catalan is the language proper to Catalonia, the Balearic Islands and Valencia and is also an official language in these areas, alongside Spanish. In Andorra, Catalan is the only official language according to article 2 of the 1993 Constitution of the Principality of Andorra. Neither North Catalonia nor L'Alguer has its own law on language.

In addition, on 11 December 1990, the European Parliament approved the "Resolution on the situation of languages in the Community and on the Catalan language". This resolution recognises the identity, current validity and use of Catalan within the context of the European Union and proposes that Catalan be included in certain actions undertaken by European institutions.


22.2.2.2 GALICIAN
ORIGIN
Latinate Galician charters from the 8th century onward show that the local written Latin was heavily influenced by local spoken Romance, yet it is not until the 12th century that we find evidence of the local language being identified as distinct from Latin itself. The linguistic stage from the 13th to the 15th centuries is usually known as Galician-Portuguese (or Old Portuguese, or Old Galician), in acknowledgement of the cultural and linguistic unity of Galicia and Portugal during the Middle Ages: the two linguistic varieties differed only in minor dialectal phenomena and were considered by contemporaries to be a single language. Galician-Portuguese lost its unity when the County of Portugal obtained its independence from the Kingdom of Leon, a transition initiated in 1139 and completed in 1179, establishing the Kingdom of Portugal. Portuguese was the official language of the Portuguese chancellery, while Galician was the usual language not only of troubadours and peasants, but also of local noblemen and clergy and of their officials, so forging and maintaining two slightly different standards.

Although Galician remained the most widely spoken language, during the 17th century the elites of the Kingdom began speaking Castilian, most notably in towns and cities. During the 19th century a thriving literature developed, in what was called the Rexurdimento (Resurgence) of the Galician language. Later, under the Franco dictatorship, the written or public use of the Galician language was outlawed. With the advent of democracy, Galician has been brought into the country's institutions, and it is now co-official with Spanish in Galicia. Galician is taught in schools, and there is a public Galician-language television channel, Televisión de Galicia. Today, the most common language for everyday use in the largest cities of Galicia is Spanish rather than Galician, as a result of this long process of language shift. Galician is still the main language in rural areas, though.

AREAS WHERE SPOKEN
Galician is spoken by some 2.4 million people, mainly in Galicia, an autonomous community located in northwestern Spain, where it is official along with Spanish. The language is also spoken in some border zones of the neighbouring Spanish regions of Asturias and Castile and León, as well as by Galician migrant communities in the rest of Spain, in Latin America, the United States, Switzerland and elsewhere in Europe.

LEGAL FRAMEWORK
The legal framework on language in Spain is to be found in the 1978 Constitution, mainly in article 3, and in article 5 of the Statute of Autonomy of Galicia, which states that Galician is an official language in the autonomous community of Galicia. Additionally, the Law of Galician language normalization (2004) supports the promotion of the use of Galician in Galician society. In the other Spanish regions where Galician is spoken, there are fewer laws protecting the language.


22.2.2.3 BASQUE
ORIGIN
Geographically surrounded by Indo-European languages, Basque is classified as a language isolate. It is the last remaining descendant of the pre-Indo-European languages of Western Europe. At the , the Basque language became the main everyday language, while other languages like Spanish, Gascon, French, or Latin were preferred for administration and higher education. By the 16th century, the Basque-speaking area had been reduced essentially to the present-day seven provinces of the Basque Country, excluding the southern part of Navarre, the southwestern part of Álava, and the western part of Biscay, and including some parts of Béarn. In 1807, Basque was still spoken in the northern half of Álava, including its capital city Vitoria-Gasteiz, and in a vast area of central Navarre, but in these two provinces Basque experienced a rapid decline that pushed it northwards. In the French Basque Country, Basque was still spoken throughout the territory except in Bayonne and some surrounding villages, and also in some bordering towns in Béarn.

In the 20th century, however, the rise of Basque nationalism spurred increased interest in the language as a sign of ethnic identity, and with the establishment of autonomous governments in the Spanish Basque Country, it has recently made a modest comeback. In the Spanish part, Basque-language schools for children and Basque-teaching centres for adults have brought the language to areas such as Encartaciones and the Ribera in Navarre, where it is not known whether it was ever spoken before; and in the French Basque Country, these schools and centres have almost stopped the decline of the language.

AREAS WHERE SPOKEN
Native speakers live in a contiguous area that includes parts of four Spanish territories and the three "ancient provinces" in France. Gipuzkoa, most of Biscay, a few municipalities of Álava, and the northern area of Navarre formed the core of the remaining Basque-speaking area before measures were introduced in the 1980s to strengthen the language. By contrast, most of Álava, the western part of Biscay and the central and southern areas of Navarre are predominantly populated by native speakers of Spanish, either because Basque was replaced by Spanish over the centuries (most of Álava and central Navarre) or because it was possibly never spoken there (Encartaciones and southeastern Navarre).

Under Restorationist and Francoist Spain, the public use of Basque was suppressed and regarded as a sign of separatism. A standardised form of the Basque language, called Euskara Batua, was developed by the Basque Language Academy in the late 1960s. Apart from this standardised version, the five main Basque dialects are Bizkaian, Gipuzkoan, and Upper Navarrese in Spain, and Navarrese–Lapurdian and Zuberoan in France. Although they take their names from the historic Basque provinces, the dialect boundaries are not congruent with province boundaries. Euskara Batua was created so that the Basque language could be used, and easily understood by all Basque speakers, in formal situations (education, mass media, literature), and this is its main use today. In both Spain and France, the use of Basque for education varies from region to region and from school to school.


Basque is spoken by 27% of Basques in all territories (714,136 out of 2,648,998). Of these, 663,035 are in the Spanish area of the Basque Country and the remaining 51,100 are in the French portion.

LEGAL FRAMEWORK
The Spanish Constitution of 1978 states in Article 3 that the Spanish language is the official language, but allows autonomous communities to provide a co-official language status for the other languages of Spain. Consequently, the Statute of Autonomy of the Basque Autonomous Community establishes Basque as the co-official language of the autonomous community. The Statute of Navarre establishes Spanish as the official language of Navarre, but grants co-official status to the Basque language in the Basque-speaking areas of northern Navarre. Basque has no official status in the French Basque Country and French citizens are barred from officially using Basque in a French court of law. However, the use of Basque by Spanish nationals in French courts is permitted (with translation), as Basque is officially recognised on the other side of the border.

The positions of the various existing governments differ with regard to the promotion of Basque in areas where Basque is commonly spoken. The language has official status in the territories within the Basque Autonomous Community, where it is spoken and promoted heavily, but only partially in Navarre. The Ley del Vascuence ("Law of Basque"), seen as contentious by many Basques but considered by the main political parties of Navarre to fit Navarre's linguistic and cultural diversity, divides Navarre into three language areas: Basque-speaking, non-Basque-speaking, and mixed. Support for the language and the linguistic rights of citizens vary depending on which of the three areas they are located in.

22.2.2.4 ARANESE
ORIGIN
Aranese is a standardised form of the Pyrenean Gascon variety of the Occitan language spoken in the Val d'Aran.

AREAS WHERE SPOKEN
Aranese is spoken in the Val d'Aran, in northwestern Catalonia close to the Spanish border with France, where it is one of the three official languages alongside Catalan and Spanish. Aranese is the mother tongue of 34.2% of the population of the Val d'Aran. It is the second most spoken language in the Val d'Aran, after Spanish. Additionally, 19% of the population of the Val d'Aran speaks Catalan.

LEGAL FRAMEWORK
In 2010, Aranese was named the third official language of the whole of Catalonia by the Parliament of Catalonia. The Statute of Autonomy of Catalonia (2006) also establishes Aranese as an official language in Catalonia, in accordance with the laws on linguistic normalisation.


22.3 RELEVANT ORGANISATIONS
The institutions responsible for overseeing the official and co-official languages in Spain are:
• Spanish
  - Real Academia Española (http://www.rae.es/): its main mission is to ensure that the changes the Spanish language undergoes as it adapts to the needs of its speakers do not break the unity among all Spanish speakers.
  - Asociación de Academias de la Lengua Española (http://www.asale.org/): its mission is to support the unity of the Spanish language among all Spanish speakers.
  - Instituto Cervantes (http://www.cervantes.es/default.htm): its main mission is to promote and teach Spanish and the co-official languages in order to disseminate Spanish and Hispanic American culture.
  - Fundéu BBVA (http://www.fundeu.es/): its mission is to promote the good use of Spanish in the media.
• Catalan
  - Institut d'Estudis Catalans (http://www.iec.cat/activitats/entrada.asp): the institution that brings together leading researchers in the area of Catalan linguistics.
  - Acadèmia Valenciana de la Llengua (http://www.avl.gva.es/inici.html): the official standardisation body for the Valencian language. It forms part of the institutions of the Generalitat Valenciana, the Valencia Regional Government.
• Galician
  - Real Academia Galega (http://academia.gal): its main mission is the promotion of the Galician language and culture.
  - Centro Ramón Piñeiro para la Investigación en Humanidades (CIRP) for Galician Language (http://www.cirp.es/): it is in charge of promoting and disseminating actions, projects and programmes on Galician linguistic, literary, historical and anthropological studies.
• Basque
  - Euskaltzaindia, the Royal Academy of the Basque Language (http://www.euskaltzaindia.eus/index.php?lang=en): founded in 1919, it is the official body responsible for Euskara, the Basque language. It carries out research on the language with the aim of safeguarding it, and has formulated the rules for the normalisation of the language.
• Aranese
  - Institut d’Estudis Aranesi-Académia Aranesa dera Lengua Occitana (http://www.conselharan.org/es/): it carries out research on the language with the aim of safeguarding it.

Additionally, there are the following terminology associations:
• AETER: Spanish Association of Terminology (http://www.aeter.org)
• SCATERM: Catalan Society of Terminology (http://blogs.iec.cat/scaterm/)

The following organisations produce relevant terminological and lexicographic works (such as dictionaries and terminology banks):
• Real Academia Española (http://www.rae.es/)
• Real Academia de Ingeniería (http://www.raing.es)
• Real Academia de Medicina (http://www.ranm.es/)
• Real Academia de Ciencias Exactas, Físicas y Naturales (http://www.rac.es)
• Real Academia Nacional de Farmacia (http://www.ranf.com/)
• Instituto de Estudios Documentales de Ciencia y Tecnología (IEDCYT) del CSIC (http://www.iedcyt.csic.es/)
• TERMINESP (http://www.automatictrans.es/)
• TERMCAT (http://www.termcat.cat/)
• Centro Vasco de Terminología y Lexicografía (UZEI) (http://www.euskonews.com/0023zbk/gaia2302es.html)

Spain has a strong research community in LT. The following diagram shows the most relevant research institutions.

FIGURE 10 SPANISH RELEVANT RESEARCH INSTITUTIONS

The Spanish industry associated with languages and language technologies also holds a relevant position in the market. This sector is represented by:
• Sociedad Española para el procesamiento del Lenguaje Natural
• Plataforma del español (cluster inside Madrid Network)
• Langune (Language Industry Association of the Basque Country)
• Clúster Catalán de las Industrias de la Lengua (CLUSTERLINGUA)

The main language technology companies by region are:

Madrid:
• IBM Voice Technology Development - Spain Group
• Telefónica I+D, División de Tecnología del Habla
• Daedalus data decisions and language
• Celer soluciones
• Seprotec multilingual Solutions
• Habla Computing
• Molino de Ideas
• Bitext innovations
• Paradigma Tecnológico
• Mamvo Performance

Galicia:
• Quobis Networks

Extremadura:
• Oteara

Valencia:
• Siimbiotika
• Innovative Social Technologies

Baleares:
• Simach 2010

Navarra:
• Ibercentro

The main publishers in Spain are:
• Editorial Tirant Lo Blanc
• Calamo y Cran
• Comercial de ediciones SM
• Aralia Editores

22.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
22.4.1 PAST AND CURRENT SITUATION
In Spain, no specific language technology strategies or policies have been defined to date. There were, and still are, laws and regulations that support the promotion, use and learning of Spanish and the co-official languages in the corresponding regions, as described in the previous chapter. Regional governments, such as those of the Basque Country, Catalonia and Galicia, publish calls for funding translations or the generation of content in their official languages. However, support for the development of new products and solutions based on language technologies is included in programmes that support ICT or generic research and innovation projects. The same applies at national level.

Language technologies are therefore supported by the national and regional governments through their policies, strategies and programmes related to Information and Communication Technologies (ICT). In fact, in the National Plan of R&I (2008 and 2013), one of the objectives of the Strategic Action on Telecommunication and Information Society was Natural Language Processing technologies132.

In the current National Plan of R&I (2013-2016), ICT technologies are driven by the Digital Agenda for Spain133. The Digital Agenda for Spain, approved on 15 February 2013, is the Government’s strategy to develop the digital economy and society in Spain during 2013-2015. This strategy is conceived as the umbrella for all the Government’s actions on Telecommunications and the Information Society. The Agenda, led by the Ministry of Industry, Energy and Tourism (MINETUR) and the Ministry of Finance and Public Administrations, sets the ICT and e-Administration roadmap to achieve the goals of the Digital Agenda for Europe in 2015 and 2020. Language technologies are not explicitly mentioned in this Agenda, but it includes some areas where they can play a relevant role:
• Big Data
• Digital Inclusion Plan
• ICT in SME and e-commerce plan
• Technology Company Internationalization Plan

This strategy is implemented through public grants to research and innovation projects, training and support for talent incorporation.

22.4.2 FUTURE STRATEGIES AND POLICIES
Within the Spanish Secretariat of State for Telecommunications and the Information Society (SETSI), awareness of the relevance of language technologies at research and industrial level is increasing. Consequently, first steps are being taken to define new policies and strategies for supporting and promoting the sector. This development is favoured by the fact that the SETSI is the institution in charge of the Digital Agenda for Spain, which drives the roadmap for Information and Communication Technologies in the coming years. In fact, the SETSI is starting to consider that language technologies can support several of the main objectives of the Digital Agenda:
• Development of the digital economy for the growth, competitiveness and internationalisation of Spanish companies
• Improvement of the electronic administration and the public digital services
• Promotion of Research, Development and Innovation in the industries of the future
• Promotion of digital inclusion and literacy, and training for the new ICT professionals
• ICT plan in SMEs and e-commerce
• Internationalisation plan for technological companies

132 R&I National Plan (page 141) http://www.idi.mineco.gob.es/stfls/MICINN/Investigacion/FICHEROS/PLAN_NACIONAL_CONSEJO_DE_MINISTROS.pdf
133 http://www.agendadigital.gob.es/digital-agenda/Documents/digital-agenda-spain-slideshow-presentation.pdf

Therefore, although nothing has been confirmed or officially published so far, it seems very possible that a new plan or strategy for supporting language technologies as a strategic sector in Spain could be defined in the near future by the SETSI in the framework of the Digital Agenda. The Spanish Ministry is considering certain steps towards a language technologies strategy that will be published at a later stage, this year or in 2016, but there is no official statement yet134.

134 Personal communication subject to confidentiality

23. SWEDEN
23.1 BACKGROUND
23.1.1 COUNTRY CHARACTERISTICS
Sweden, officially the Kingdom of Sweden, is a Scandinavian country in Northern Europe. It borders Norway and Finland, and is connected to Denmark by a bridge-tunnel across the Öresund. At 450,295 square kilometres (173,860 sq mi), Sweden is the third-largest country in the European Union by area, with a total population of over 9.7 million. Sweden consequently has a low population density of 21 inhabitants per square kilometre (54/sq mi), with the highest concentration in the southern half of the country. Approximately 85% of the population lives in urban areas. Southern Sweden is predominantly agricultural, while the north is heavily forested. Sweden is part of the geographical area of Fennoscandia.

Today, Sweden is a constitutional monarchy and a parliamentary democracy, with the Monarch as the head of state. The capital city is Stockholm, which is also the most populous city in the country. Legislative power is vested in the 349-member unicameral Riksdag. Executive power is exercised by the Government, chaired by the Prime Minister. Sweden is a unitary state, currently divided into 21 counties and 290 municipalities.

Sweden maintains a Nordic social welfare system that provides universal health care and tertiary education for its citizens. It has the world's eighth-highest per capita income and ranks highly in numerous metrics of national performance, including quality of life, health, education, protection of civil liberties, economic competitiveness, equality, prosperity and human development. Sweden has been a member of the European Union since 1 January 1995, but declined Eurozone membership following a referendum. It is also a member of the United Nations, the Nordic Council, Council of Europe, the World Trade Organization and the Organisation for Economic Co-operation and Development (OECD).

23.1.2 LANGUAGES OVERVIEW SITUATION
Swedish is a North Germanic language, spoken natively by about 9 million people, predominantly in Sweden and parts of Finland, where it has equal legal standing with Finnish. It is largely mutually intelligible with Norwegian and Danish. Along with the other North Germanic languages, Swedish is a descendant of Old Norse, the common language of the Germanic peoples living in Scandinavia during the Viking Era. It is currently the largest of the North Germanic languages by number of speakers. Standard Swedish, spoken by most Swedes, is the national language that evolved from the Central Swedish dialects in the 19th century and was well established by the beginning of the 20th century. While distinct regional varieties descended from the older rural dialects still exist, the spoken and written language is uniform and standardized. In 1999, the Minority Language Committee of Sweden formally declared five national minority languages of Sweden: Finnish, Meänkieli (also known as Tornedal, Tornionlaaksonsuomi or Tornedalian), the Sami languages, Romani, and Yiddish.

23.2 LANGUAGES
23.2.1 OFFICIAL
Swedish is the official language of Sweden and is spoken by the vast majority of the nine million inhabitants of the country. It is a North Germanic language and quite similar to its sister Scandinavian languages, Danish and Norwegian.

23.2.2 CO-OFFICIAL LANGUAGES
In 1999, the Minority Language Committee of Sweden formally declared five minority languages of Sweden: Finnish, Meänkieli (also known as Tornedal, Tornionlaaksonsuomi or Tornedalian), the Sami languages, Romani, and Yiddish.

23.2.2.1 FINNISH
Today there are about 470,000 Finnish-speakers in Sweden. Finnish, a Uralic language, has long been spoken in Sweden (the same holds true for Swedish in Finland), as Finland was part of the Swedish kingdom for centuries. Ethnic Finns (mainly first and second generation immigrants) constitute up to 5% of the population of Sweden. A high concentration of Finnish-speakers (some 16,000) resides in .

23.2.2.2 MEÄNKIELI
Meänkieli is also a Finnic language. Spoken by the Tornedalian people, it is so closely related to Finnish that they are mutually intelligible, and it is sometimes considered a dialect of Finnish. Meänkieli is mainly used in the municipalities of Gällivare, , , and Övertorneå, all of which lie in the Torne Valley. Between 40,000 and 70,000 people speak Meänkieli as their first language.

23.2.2.3 THE SAMI LANGUAGES
The Sami people (formerly known as Lapps) are a people indigenous to all of northern Scandinavia who speak a closely related group of languages usually grouped together under the name "Sami", although at least three separate Sami languages are spoken in Sweden. The languages are, like Finnish and Meänkieli, Uralic. Due to prolonged exposure to Germanic-language-speaking neighbors in Sweden and Norway, Sami languages have a large number of Germanic loanwords, which are not normally found in other Uralic languages like Finnish, Estonian, or Hungarian. Between 15,000 and 20,000 Sami people live in Sweden, of whom 9,000 are Sami-language speakers. Worldwide, between 20,000 and 40,000 people speak Sami languages (most Sami now speak Swedish, Norwegian, Finnish, or Russian as their first language, depending on the country in which they reside). In Sweden, the largest concentrations of Sami-language-speaking Sami are found in the municipalities of , Gällivare, and Kiruna, and their immediate neighbourhood.

23.2.2.4 ROMANI
Romani (also known as the Romani Chib) is the language spoken by the Romani people, a nomadic ethnic group originating in northern India. Due to the geographic origins of its speakers, Romani is an Indo-Aryan language, closely related to languages spoken in modern-day India, and sometimes written with an Indic script. Around 90% of Sweden's Romani people speak Romani, meaning that there are approximately 9,500 Chib speakers. In Sweden, there is no major geographic center for Romani like there is for Finnish, Sami, or Meänkieli, but the language is considered to be of historical importance by the Swedish government, and as such the government is seen as having an obligation to preserve it, a distinction also held by Yiddish.

23.2.2.5 YIDDISH
Yiddish is a Germanic language with significant Hebrew and Slavic influence, written with a variant of the Hebrew alphabet and formerly spoken by most Ashkenazic Jews (although most now speak the language of the country in which they live). Although the Jewish population of Sweden was traditionally Sephardic, Ashkenazic immigration began after the 18th century, and the immigrants brought their Yiddish language with them. Like Romani, it is seen by the government to be of historical importance. The organization Sällskapet för Jiddisch och Jiddischkultur i Sverige (Society for Yiddish and Yiddish Culture in Sweden) has over 200 members, many of whom are mother-tongue Yiddish speakers. It arranges regular activities for the speech community and engages in external advocacy of the Yiddish language.

As of 2009, the Jewish population in Sweden was estimated at around 20,000. Out of these 2,000-6,000 claim to have at least some knowledge of Yiddish according to various reports and surveys. The number of native speakers among these has been estimated by linguist Mikael Parkvall to be 750-1,500. It is believed that virtually all native speakers of Yiddish in Sweden today are adults and most of them elderly.

FIGURE 11 MAP OF LANGUAGES IN SWEDEN AND ADJACENT SCANDINAVIAN COUNTRIES (SOURCE: HTTP://ARCHIVE.ETHNOLOGUE.COM)

23.3 RELEVANT ORGANISATIONS
Institutet för språk och folkminnen – Institute for Language and Folklore (http://www.sprakochfolkminnen.se)
Språkförsvaret (http://www.språkförsvaret.se, Swedish only)

23.4 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
23.4.1 FINNISH
Finnish has the status of a national minority language in Sweden. This status means that Sweden takes cultural and political responsibility for the survival of the language. Minority language speakers have the right to use their language in contact with authorities in those regions designated as administrative areas for the language.

23.4.1.1 MEÄNKIELI
Meänkieli has the status of a national minority language in Sweden. This status means that Sweden takes cultural and political responsibility for the survival of the language. Minority language speakers have the right to use their language in contact with authorities in those regions designated as administrative areas for the language.

23.4.1.2 THE SAMI LANGUAGES
Sami has the status of a national minority language in Sweden. This status means that Sweden takes cultural and political responsibility for the survival of the language. Minority language speakers have the right to use their language in contact with authorities in those regions designated as administrative areas for the language.

23.4.1.3 ROMANI
Romani has the status of a national minority language in Sweden. This status means that Sweden takes cultural and political responsibility for the survival of the language. Minority language speakers have the right to use their language in contact with authorities in those regions designated as administrative areas for the language.

23.4.1.4 YIDDISH
Yiddish has the status of a national minority language in Sweden. This status means that Sweden takes cultural and political responsibility for the survival of the language. Minority language speakers have the right to use their language in contact with authorities in those regions designated as administrative areas for the language.

23.4.2 CURRENT NATIONAL POLICIES AND STRATEGIES
Since 2009, Sweden has had a language law that regulates the status of Swedish and other languages in Sweden. The objectives of Swedish language policy are spelled out in the Language Act (2009:600), which entered into force on 1 July 2009. The law aims primarily to clarify the position of Swedish and other languages in Swedish society. Authorities and other public bodies have a particular responsibility for the use and development of Swedish. The same responsibility applies to the promotion of the national minority languages and Swedish Sign Language. The Language Act is a framework law which sets out the principles, goals and guidelines for the use of languages. Public organizations must comply with the Language Act, but there are no penal provisions for violating it. A violation can, however, lead to examination by, for example, the Ombudsman, who can issue criticism of activities contrary to the Language Act.

Language Policy

Everyone should have the right to language: this is one of the four overarching objectives of Swedish language policy. On 7 December 2005, the Swedish Parliament decided on an integrated Swedish language policy with four overarching goals:
• Swedish should be the main language in Sweden.
• Swedish should be a complete language that serves and supports society.
• Public Swedish is to be cultivated, simple and comprehensible.
• Everyone should have the right to language: to develop and learn Swedish, to develop and use their own mother tongue and national minority language, and to have the opportunity to learn foreign languages.

The Language Law and Minority Law
Since 1 July 2009, Sweden has had a language law that consolidates the language policy. Since 1 January 2010 there has also been a law on national minorities and minority languages. The language policy applies to all languages in Sweden. Under the language policy and the Language Act, Swedish is the principal language and the common language of society. People who have other languages as their mother tongue through immigration are also covered by the language policy and the applicable law.

24. UK – UNITED KINGDOM
24.1 BACKGROUND
24.1.1 COUNTRY CHARACTERISTICS
The United Kingdom of Great Britain and Northern Ireland, commonly known as the United Kingdom (UK) and Britain, is a country in northern Europe that consists of England, Wales, Scotland, and the province of Northern Ireland, which occupies the north-eastern part of the island of Ireland.

The UK is located on an archipelago known as the British Isles with the main islands of Great Britain and Ireland, and the surrounding island groups of the Hebrides, the Shetlands, the Orkneys, the Isle of Man and the Isle of Wight. The UK is located off the northern coast of France and west of Sweden and Denmark, between the North Sea and the North Atlantic Ocean.

The capital city is London, with a metropolitan population of about 7.8 million. The UK is made up of the following administrative divisions: 47 counties, 7 metropolitan counties, 26 districts, 9 regions, and 3 islands areas.

24.1.2 LANGUAGES OVERVIEW SITUATION
England has a population of 51.8 million people, of which 16% belong to an ethnic minority group or are of mixed race. It is favoured linguistically not only by having a major world language, English, as its official language but also by a very high degree of linguistic diversity: the latest survey in London found 233 distinct languages.

England has only one recognised regional minority language – Cornish, used to some degree by several hundred people (2008).

Wales has a population of 3 million. In 2001, 20.8% (582,000) of them could speak Welsh, according to the census. Wales was conquered by England in 1282, and the 1536 Act of Union banned those using the Welsh language from holding public office. The majority of the population of Wales continued to speak Welsh until late in the 19th century. Extensive immigration, mostly from England and Ireland due to the industrial revolution, coupled with the virtual exclusion of Welsh when compulsory education was introduced, led to a decline in the numbers and proportion of Welsh speakers, and a contraction of the area where Welsh was widely spoken. In 2001, 75,000 Welsh speakers lived in the three cities covered by the LRE research, representing 12% of their total population.

Scotland has a population of 5.22 million people, of which 92,000, or just under 2%, have some knowledge of Gaelic. Scotland has been attracting inward migration since 2002: the 2001 census showed a 2% non-white ethnic minority, with the majority being of Pakistani origin, but by 2009 a national pupil survey showed that 4.3% of school children mainly used a language other than English at home. Altogether, 138 languages were recorded as being spoken, with Polish at the head of the list with 0.8% of the school population, followed by Panjabi, Urdu, Arabic, Cantonese, French and Gaelic respectively. 626 pupils were registered as speaking mainly Gaelic at home, slightly less than one in 1,000. However, many more are receiving Gaelic medium education or are being taught Gaelic through the medium of Gaelic: 4,064 in 2011, equivalent to one in every 180 pupils.

Northern Ireland has a population of 1.8 million people. While English is the vernacular, the 2001 census found that 10% of the population reported ‘some knowledge’ of Irish. Since the stabilisation of the political situation in the late 1990s the country has attracted an increasing number of immigrants. Following the 2001 census, the most significant language groups were identified as Chinese, Arabic and Portuguese. However, more recent immigration from the Accession Eight (A8) countries of the European Union has given Polish, followed by Lithuanian, a significant presence. Currently 3% of primary school children have a language other than English as their first language, rising to 11% in Dungannon, the most diverse district.

In addition to these three languages, other minority languages are recognized under the European Charter for Regional or Minority Languages, such as:
• Cornish: one of the Brittonic languages, which constitute a branch of the Insular Celtic section of the Celtic language family. In the 2011 UK census, 557 people in England and Wales declared Cornish to be their main language, 464 of whom lived in Cornwall.
• Scots: the Germanic language variety spoken in Lowland Scotland and parts of Ulster (where the local dialect is known as Ulster Scots). The 2011 UK census was the first to ask residents of Scotland about Scots. A campaign called Aye Can was set up to help individuals answer the question. The specific wording used was "Which of these can you do? Tick all that apply" with options for 'Understand', 'Speak', 'Read' and 'Write' in three columns: English, Scottish Gaelic and Scots. Of approximately 5.1 million respondents, about 1.2 million (24%) could speak, read and write Scots, 3.2 million (62%) had no skills in Scots and the remainder had some degree of skill, such as understanding Scots (0.27 million, 5.2%) or being able to speak it but not read or write it (0.18 million, 3.5%).

24.1.3 REGULATION FRAMEWORK
English is not a de jure official language of the UK, since there is no formal constitution. However, it can be considered the de facto official language, given that it is the language used by the British government and is spoken by around 94% of the 62 million inhabitants of the UK. English is spoken in almost 60 sovereign states. It is the most commonly spoken language in the UK, the United States, Canada, Australia, Ireland and New Zealand, and is widely spoken in countries in the Caribbean, Africa, and South Asia. It is the third most common native language in the world after Mandarin and Spanish. It is widely learned as a second language and is an official language of the United Nations, of the European Union and of many other world and regional international organisations.

Gaelic has a special status under British law that provides certain measures for preserving the language. The Gaelic Language (Scotland) Act 2005 was passed by the Scottish Parliament with a view to securing the status of the Gaelic language as an official language of Scotland commanding equal respect to the English language. With respect to Welsh, a language law has consolidated its official position in Wales since 1993. The Welsh language is considered equal to English in Wales under Section 78 of the Government of Wales Act 2006.

Regarding Irish, the UK has made a number of binding commitments in relation to the Irish language in Northern Ireland under Part III of the European Charter for Regional or Minority Languages. Under Article 10 of the Charter these include, where justified, duties for public services to:
• provide for speakers to submit oral or written applications in Irish;
• allow public authorities to draft documents in Irish;
• permit/encourage the use of Irish as well as English in debates in Council chambers/the Northern Ireland Assembly;
• permit/encourage the use of traditional and correct forms of place names in Irish (in conjunction with English if needed).

Also of particular relevance to local government are commitments under Article 12 in respect of public authorities that have a role in the field of cultural activities and facilities.

24.2 RELEVANT ORGANISATIONS
In the UK there is a large number of research institutions working in language technologies, such as the University of Sheffield, the University of Leeds, the University of Cambridge and the University of Edinburgh. Additionally, there are strong language services and technology providers. Some of the most relevant are: SDL, RWS Holdings PLC, Thebigword Group, Hogarth Worldwide, Alpha CRC, Applied Language Solutions, Lingo24, Wordbank, Translation Empire and Sandberg Translation Partners / STP Nordic.

The main institutions responsible for overseeing the official and co-official languages in the UK are:
English:
• The British Council is the UK’s international organization for cultural relations and educational opportunities. http://www.britishcouncil.org/
• EPSRC is the UK's main agency for funding research in engineering and physical sciences. https://www.epsrc.ac.uk/
• The English Association, whose aim is to promote the English language. http://www2.le.ac.uk/offices/english-association
• The European Society for the Study of English (ESSE) promotes the study and understanding of English languages, literature and cultures of English-speaking people within Europe. http://www.essenglish.org/

Gaelic:  Scottish government - Learning, Science and Scotland’s Languages. www.gov.scot/About/People/Ministers/Cabinet-Secretary-for-Education-and-Lifelong-Learn  Bòrd na Gàidhlig is the executive non-departmental public body of the Scottish Government with responsibility for Gaelic. http://www.gaidhlig.org.uk/bord/en/the-bord/about-bord-na-gaidhlig/

Welsh  The Partnership Council is responsible for giving advice and making representations to Ministers in relation to the Welsh language strategy. It is made up by expert from university, industry and government. http://gov.wales/topics/welshlanguage/welsh-language-partnership-council/?lang=en  Welsh Government – Welsh language Unit. www.wales.gov.uk/welshlanguage  The Coleg Cymraeg Cenedlaethol (National Welsh Language College) was established in 2011. http://www.colegcymraeg.ac.uk/en/

Irish  The Department of Culture, Arts and Leisure (DACL). UK Government http://www.dcalni.gov.uk/index.htm  British-Irish Council – Indigenous, minority and lesser-used languages www.britishirishcouncil.org/areas- work/indigenous-minority-and-lesser-used-languages  Comhairle na Gaelscolaíochta (CnaG) is the representative body for Irish-medium Education. It was set up in 2000 by the Department of Education to promote, facilitate and encourage Irish-medium Education. www.comhairle.org/english  Foras na Gaeilge, the body responsible for the promotion of the Irish language throughout the whole island of Ireland. www.gaeilge.ie/about-foras-na-gaeilge/?lang=en

FIGURE 12 LANGUAGES OF THE UK

24.3 NATIONAL AND REGIONAL POLICIES AND STRATEGIES
24.3.1 PAST AND CURRENT SITUATION
English
UK support for language technology research began in the 1990s with the Department of Trade and Industry’s four-year Speech and Language Technology (SALT) programme, which funded a wide range of small collaborative projects in the field. Since that period, there has not been a major programme specifically involving language technology support, although there are numerous public sources of research, development and technology transfer funding.

On the other hand, England’s lack of ‘national capability’ in languages has been a matter of considerable debate in recent years and, in particular, since the Nuffield Languages Inquiry of 2000. At policy level and in public discourse, languages are described as important, but in practice and provision there have been many fault lines. This is undoubtedly a reflection of the growing importance of English as a lingua franca and a continuing perception that ‘English is enough’ and that other languages are ‘important but not essential’. Despite this, there has been significant progress and innovation in introducing the early learning of other languages, in supporting community languages, and in promoting language competence to young people. Partly as a result of this, languages remain on the political agenda: the case is not closed.

The National Languages Strategy (2002–2011) was responsible for a number of key initiatives, especially the creation of a framework for language learning for ages seven to 11 (The Key Stage 2 framework for languages) and a new assessment framework (The Languages Ladder/Asset languages) based on the CEFR (Common European Framework of Reference for Languages). It also supported links between mainstream and complementary schools such as the Our Languages initiative.

“Routes into Languages”, managed by the University of Southampton, has targeted secondary school students with messages about the importance of language learning through direct engagement with universities and student ambassadors. It has brought universities into contact with schools and developed some highly successful models of collaboration.

The 2011 report “Labour Market Intelligence on Languages and Intercultural Skills in Higher Education” (CILT) demonstrated the need for a wide range of languages across both public and private sectors in combination with different workplace skills. In 2011 a new campaign was launched to support language learning - Speak to the Future. This has built a broad coalition of support around five key issues to promote the importance of language skills and bring about changes in policy and attitudes.

Language technologies are not covered by the strategies described above, but they have been considered as part of funding programmes within the EPSRC. The EPSRC is the UK's main agency for funding research in engineering and physical sciences, including information technology. The EPSRC provides funding to different research areas. “Natural Language Processing” is a research area that aims to derive meaning from human language or to generate human language to enable communication, and hence encompasses both language understanding and language generation. NLP typically uses statistical and linguistic methods to achieve its aims. Applications could include: the translation of text to another language; data extraction from text; answering questions about the contents of the text; paraphrasing an input text; dialogue; sentiment analysis. The main figures in this research area are:
• Relevant grants: 33
• Proportional value: £5,430,807
• % of EPSRC portfolio: 0.12 %

Gaelic
The Scottish Government recognises that Gaelic is an integral part of Scotland’s heritage, national identity and current cultural life. The Scottish Government has taken action and has put in place the necessary structures and initiatives to ensure that Gaelic has a sustainable future in a modern and vibrant Scotland. However, the position of Gaelic remains extremely fragile. If Gaelic is to have a sustainable future, there needs to be a concerted effort on the part of Government, the public sector, the private sector, community bodies and individual speakers to:
• promote the acquisition of speaking, reading and writing skills in Gaelic
• enable the use of Gaelic in a range of social, formal and work settings
• expand the respect for, and visibility, audibility and recognition of Gaelic
• develop the quality, consistency and richness of Gaelic

Gaelic enjoys a high level of political support through the Gaelic Language Plan. The first Gaelic Language Plan was defined for the period 2012 – 2015. Section 3 of this Plan, “Gaelic language development”, was drawn up with a view to how Gaelic learning and the use of the language in various contexts can be promoted through the delivery of council services. Specifically, this section includes the sub-section “Language Corpus”, which is concerned with promoting the consistency and strength of the language, for example by supporting translation facilities and the development of new terminology and place names.

Since this plan, there has been an increase in the number of children going through Gaelic education, an increase in the number of Gaelic schools and units, a growth in the number of public authorities making commitments through their Gaelic Language Plans, and BBC Alba broadcasting on Freeview while supporting the independent programme-making sector in Scotland. Arguably the most significant outcome was the release of the latest Census figures which, though showing a further decline in Gaelic speakers in the older age groups, showed an increase in all age groups below 20. This suggests that policies that support Gaelic education are making a difference.

The Gaelic Language Plan 2015 – 2020 was launched this year. This plan again includes the Corpus Area as a Development Area, covering initiatives focusing on terminology, translation, orthography and place-names for the purpose of ensuring that Gaelic continues to develop and to achieve greater strength, relevance, consistency and visibility.

Neither Gaelic Language Plan makes a clear reference to support for language technologies; the only related element is the support for the Corpus Area described above.

A relevant project for the Gaelic language has been the TòMaS project (http://www.uhi.ac.uk/en/lis/tomas), the first ever translation memory service for Gaelic, which aims to speed up the translation process and ensure greater consistency across texts. It has been developed by the University of the Highlands and Islands.

Welsh
The Welsh Language Strategy 2012-17 has been prepared in accordance with Section 78 of the Government of Wales Act 2006. The Government's vision is to see the Welsh language thriving in Wales. To achieve that, the strategy aims to see an increase in the number of people who both speak and use the language. This is a five-year strategy, from 1 April 2012 to 31 March 2017, which supersedes Iaith Pawb, published in 2003.

The main strategic areas are:
• Use of Welsh in the family
• Children and young people
• The community
• The workplace
• Welsh language services
• Infrastructure

The sixth area, “Infrastructure”, aims to strengthen the infrastructure for the language (including translation, publishing, research, television and radio, and ICT). The corresponding themes of the action plan related to this area are:
• Marketing and raising awareness
• Work with the main technology companies
• Encourage development of new technology
• Encourage creating and sharing Welsh digital content
• Promote good practice

To encourage the development of new technologies, the activities are:
• fund the development of Welsh language programmes/Apps for all types of devices
• commission projects to develop services, voice recognition, machine translation
• fund projects to develop young people’s coding skills

The instrument for supporting the development of these technologies was the Welsh-language Technology and Digital Media Fund, which has the following features:
• An annual Fund of £250,000 to invest in Welsh technology
• £150,000 per annum - Grant Scheme
• £100,000 - Procurement programme

This program started on 27 May 2013.

Some examples of the projects funded by this program are:
• Interceptor Solutions Ltd - Enabling multilingual software interfaces and measuring impact: Lingua-Skin is a piece of software developed by Interceptor Solutions Ltd which enables other software to easily become bilingual without requiring changes to the underlying framework and is transparent to the end-user. This project proposes to pilot Lingua-Skin through both provision of a free-of-charge implementation package to early adopters and awareness raising activities. The project also includes an objective review by Cardiff University of the effectiveness of this type of approach to delivering bilingual interfaces.
• Canolfan Bedwyr – GALLU: Gwaith Adnabod Lleferydd Uwch: This project will further strengthen the current Welsh language technology infrastructure by delivering a new Speech to Text module. Canolfan Bedwyr’s Language Technologies Unit, Bangor University, will work on the next stage in the development of Welsh language speech recognition technology by crowd-sourcing a corpus of Welsh language speech. Strategically, this is a much needed resource which could facilitate the inclusion of Welsh in future developments of speech-recognition based products such as office suites and voice-activated products. The project is in cooperation with S4C.
• Speech and Language Technology Unit, Canolfan Bedwyr, Bangor University – Welsh communications infrastructure: Laying the foundations for a range of Welsh language free and open source communications technologies, including transcription, voice command and control, question answering, and speech to speech translation.

Additionally, there is a Welsh-medium Education Strategy. This Strategy is a historic milestone in Welsh-medium education and sets the Welsh Government's national strategic direction. It also sets the direction for making improvements in the teaching and learning of Welsh as a language, including Welsh as a second language. In 2007 the Welsh Government committed to 'create a national Welsh-medium Education Strategy to develop effective provision from nursery through to further and higher education, backed up by an implementation programme'. In response to this commitment, the Welsh-medium Education Strategy was launched in April 2010.

Northern Ireland

The Northern Ireland Executive, in its Programme for Government 2011-2015, has included a Strategy for the Irish Language as a key building block under Priority 4, ‘Building a Strong and Shared Community’. This follows agreements between the British and Irish Governments, which led to the Northern Ireland (NI) Act 1998 being amended in 2006 to include a requirement for the Executive to “adopt a Strategy setting out how it proposes to enhance and protect the development of the Irish language”.

This Strategy sets out plans to enhance and protect the development of the Irish language over the next 20 years, taking account of the needs of the Irish language community and international best practice.

The key aims of this Strategy are:
• Support quality and sustainable acquisition and learning of the Irish language.
• Enhance and protect the status and visibility of the Irish language.
• Deliver quality and sustainable Irish language networks and communities.
• Promote the Irish language in a way that will contribute towards building a strong and shared community.

Seven key areas for action have been identified that will benefit the Irish language. Action in each of these areas will contribute towards the achievement of this Strategy’s objectives. The areas for action are:
• Education
• Family Transmission of the Irish Language – Early Intervention
• The Irish Language and the Community
• Public Services
• Media and Technology
• Legislation and Status of the Irish Language
• Economic Life

Although Media and Technology is one of the areas for action, the strategy does not clearly state any support for or promotion of language technologies.

24.3.2 FUTURE STRATEGIES AND POLICIES

English

According to the interviews and desk research carried out, no new strategies related to language technologies are currently known to be under development. For the time being, language technologies will continue to be treated as a research area within the EPSRC.

Gaelic

The current plan runs until 2020, and a new plan is expected to follow it. It is not yet possible to know whether language technologies will play a relevant role in that new plan. Interviews with people responsible for the strategy indicated that the relevance of language technologies is starting to be considered, but at the moment no legal or published document confirms this intention.

Welsh

The Welsh Language Strategy remains in force until 2017, so the action plan is expected to be carried out. The Welsh-language Technology and Digital Media Fund will also continue until 2017, providing grants for projects aimed at developing technologies and digital media products and services.

Northern Ireland

According to our interview with some of those responsible for the strategy, the Department of Culture, Arts and Leisure (DCAL) for Northern Ireland has no language technology strategy for the Irish language.

25. ANNEX

Sources and References (if not covered by footnotes within the text):

25.1 BY COUNTRY

25.1.1 BELGIUM:
Dr. Peter Spyns, Flemish Government – Department of Economy, Science & Innovation
Personal communication with Karl-Heinz Lambertz, President of the Parliament of the German-speaking community of Belgium

25.1.2 ESTONIA:
• http://www.bbc.com/news/world-europe-17220810 (general info on Estonia)
• Language Act: https://www.riigiteataja.ee/en/eli/522062015005/consolide
• Development Plan of the Estonian Language (2011-2017): http://ekn.hm.ee/system/files/Eesti+keele+arengukava+inglise.indd_.pdf
• National Program of the Estonian Language Technology (website): https://www.keeletehnoloogia.ee/en?set_language=en
• Downloadable programme text in English: https://www.keeletehnoloogia.ee/en/npelt-text/view

25.1.3 FRANCE: Personal communication with Christian Tremblay (OEP) and Mariani.

25.1.4 GERMANY: Personal communication with Wolfgang Mackiewicz, Freie Universität Berlin

25.1.5 IRELAND: 20-Year Strategy for the Irish Language 2010-2030

25.1.6 ITALY: HIT2020 - Horizon2020 Italia: HTTPS://WWW.RESEARCHITALY.IT/UPLOADS/50/HIT2020.PDF?=1B9FFE7

25.1.8 THE NETHERLANDS:
• https://www.holland.com/us/press/facts-figures-1/the-netherlands/facts-figures-about-the-netherlands.htm
• http://www.lrec-conf.org/proceedings/lrec2006/pdf/259_pdf.pdf (STEVIN programme)
• http://www.meta-net.eu/whitepapers/volumes/dutch

25.1.9 POLAND:
• http://www.bbc.com/news/world-europe-17753718
• http://www.usefoundation.org/view/477
• http://ksng.gugik.gov.pl/english/files/act_on_national_minorities.pdf

25.1.10 UK: http://www.britishirishcouncil.org/areas-work/indigenous-minority-and-lesser-used-languages

25.1.11 GENERAL
• META-NET White Paper Series: http://www.meta-net.eu/whitepapers/overview
• NPLD: http://www.npld.eu/
• The EUROMAP Study: Joscelyne and Rose Lockwood, Benchmarking HLT progress in Europe, Copenhagen 2003
• ERRIN: www.errin.eu

25.2 PRACTICAL INFORMATION
• Language resources in all official languages can be found at: http://ec.europa.eu/translation/index_en.htm
• European companies active in the area of language technologies can be found at: http://www.lt-innovate.org/directory/members

26. COPYRIGHT POLICY

This document was prepared in the course of the LT Observatory project, a support action of Horizon 2020. Text may be quoted provided the source is mentioned: LTO National Language Policy Document 2015.

Copyright: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. For commercial use, please contact us: http://www.emfs.eu/contact
