Journal of Software

Automatic Linking of Short Arabic Texts to Wikipedia Articles

Fatoom Fayad1*, Iyad AlAgha2

1 Computer Center, Palestine Technical College-Deir El-Balah, Gaza Strip, Palestine.
2 Faculty of Information Technology, The Islamic University of Gaza, Gaza Strip, Palestine.

* Corresponding author. Tel.: 00972592542727; email:
[email protected]

Manuscript submitted January 10, 2016; accepted March 8, 2016.
doi: 10.17706/jsw.11.12.1207-1223

Abstract: Given the enormous amount of unstructured text available on the Web, there is a growing need to make this text easier to discover and access. One proposed solution is to annotate texts with information extracted from background knowledge. Wikipedia, the free encyclopedia, has recently been exploited as background knowledge for annotating text with complementary information. Given any piece of text, the main challenge is to determine the most relevant information from Wikipedia with the least effort and time. While Wikipedia-based annotation has mainly targeted the English and Latin versions of Wikipedia, little effort has been devoted to annotating Arabic text using the Arabic version of Wikipedia. In addition, annotating short text presents further challenges because the statistical and machine learning techniques commonly used with long text cannot be applied. This work proposes an approach for automatically linking short Arabic texts to articles drawn from Wikipedia. It reports on several challenges associated with the design and implementation of the linking approach, including the processing of Wikipedia's enormous content, the mapping of texts to Wikipedia articles, the problem of article disambiguation, and time efficiency.