Topic Keyword Identification for Text Summarization using Lexical Clustering

Youngjoong Ko, Kono Kim, and Jungyun Seo

Department of Computer Science, Sogang University, Sinsu-dong 1, Mapo-gu, Seoul 121-742, Korea
{kyj, kono}@nlpzodiac.sogang.ac.kr, [email protected]

Corresponding author: Youngjoong Ko

Correspondence address: NLP Laboratory, Dept. of Computer Science, Sogang University, Sinsu-dong 1, Mapo-gu, Seoul 121-742, Korea

Correspondence telephone number: 82-2-706-8954

Correspondence fax number: 82-2-704-8273

Correspondence email address: [email protected]

Abstract

Automatic text summarization aims at reducing the size of a document while preserving its content. Generally, a summary is produced as an extract by including only the sentences that are most related to the document topic. DOCUSUM is our summarization system based on a new topic keyword identification method. The process of DOCUSUM is as follows. First, DOCUSUM converts the content words of a document into elements of a context vector space. It then constructs lexical clusters from the context vector space and identifies core clusters. Next, it selects topic keywords from the core clusters. Finally, it generates a summary of the document using the topic keywords. In experiments at various compression ratios (30% compression, 10% compression, and extraction of a fixed number of sentences: 4 or 8 sentences), DOCUSUM showed better performance than other methods.

Keywords: text summarization, lexical clustering, k-means algorithm, topic keyword identification


1 Introduction

The goal of automatic summarization is to take an information source, extract content from it, and present the most important content to the user in a condensed form and in a manner sensitive to the user's or application's needs [1]. To achieve this goal, the topic of an information source should be identified by a machine as well as by a human. Using the identified topics (themes), we can extract sentences from a source and generate abstracts. Therefore, one of the main goals of our method is to identify the main topics of texts. These topics can be represented as topic keywords.

We developed DOCUSUM, which produces extracts using topic keyword identification. DOCUSUM is a text summarization system based on IR techniques that combine semantic and statistical methods. For example, DOCUSUM uses not only frequency counting but also topic keyword identification through a context vector space. The context vector space is constructed automatically from co-occurrence statistics, and statistical methods are applied on top of this contextual knowledge. DOCUSUM identifies topic keywords without other linguistic resources such as the WordNet. It also does not use cue phrases or a discourse parser for topic keyword identification. These characteristics make DOCUSUM robust and low-cost.

The rest of this paper is organized as follows. Section 2 describes related work in topic identification. Section 3 gives an overview of the structure of DOCUSUM. In Section 4, we present each stage in detail, from the construction of the context vector space to the generation of the summary. Section 5 is devoted to the experimental evaluation. In the last section, we draw conclusions and present future work.

2 Related Work

Several techniques for topic identification have been reported in the literature. The pioneering work observed that the most frequent words represent the most important concepts of the text [2]; under this view, topic keywords are simply the most frequent words. This representation abstracts the source text into a frequency table. The method ignores the semantic content of words and their potential membership in multi-word phrases.

In another early summarization system, Edmundson found that the first paragraph, or the first sentences of each paragraph, contain topic information [3]. He also observed that the presence of words such as significant, hardly, and impossible signals topic sentences. Although all the techniques presented above are easily computed, these approaches depend very much on the particular format and style of writing.

To overcome the limitation of the frequency-based method, Aone et al. aggregated synonym occurrences together as occurrences of the same concept using the WordNet [4]. Barzilay and Elhadad constructed lexical chains by calculating the semantic distance between words using the WordNet [5]; strong lexical chains are selected, and the sentences related to these strong chains are chosen as a summary. These methods, which use semantic relations between words, depend heavily on manually constructed resources such as the WordNet [6]. The WordNet is not available in several languages such as Korean, and this kind of linguistic resource is hard to maintain.

Hovy and Lin used topic identification aimed at extracting the salient concepts in a document [7]. By training on a corpus of documents with their associated topics, their method yields a ranked list of sentence positions that tend to contain the most topic-related keywords. They also used a topic signature method for topic identification. To construct topic signatures, they used a set of 30,000 texts where each article is labeled with one out of 32 possible topic labels. For each topic, the top 300 terms, scored by a term-weighting metric, were treated as a topic signature.

Topic keywords in DOCUSUM are conceptually close to topic signatures, and both methods use a very large corpus. However, DOCUSUM is different in that no genre-related or supervised training is necessary, and the topic keywords are identified with the context vector space.

3 Overall Architecture

In order to produce a summary with high quality, the topic of a document should be recognized, and it can be represented as a few topic keywords. A few topic keywords from a document are a good representative form of its topic. The problem is how to identify the topic keywords. To solve this problem, DOCUSUM proceeds in four stages. First, DOCUSUM converts the content words of a document into vectors of a context vector space. Second, it constructs lexical clusters using the context vector space and identifies core clusters. Third, it selects topic keywords from the core clusters. Finally, it generates the summary of the document using the topic keywords. Figure 1 shows the architecture of DOCUSUM.

[Figure 1. Illustration of DOCUSUM. Training process: News Articles -> Co-Occurrence Analysis -> Context Vector Space. Generation process: Document -> Content Words Extraction -> Lexical Clustering -> Topic Keyword Identification -> Topic Keywords Query / Title Query -> Document Vector Space -> Summary.]

The first stage represents each word as a vector. The meaning of a word can be represented by a vector which places the word in a multidimensional semantic space [8]. The main requirement of such spaces is that words which are similar in meaning should be represented by similar vectors. DOCUSUM uses a context vector space based on a co-occurrence analysis of large corpora. The assumption is that similar words occur in similar contexts. For example, a textbook with a paragraph about 'cats' might also mention 'dogs', 'fur', 'pets', and so on. This knowledge can be used to assume that 'cats' and 'dogs' are related in meaning. For about 60,000 words, the co-occurrence statistics were calculated in a sentence window slid over a corpus of about 20 million words (newspaper articles from 2 years). Each word is represented by its co-occurrence values with the other words. If any two words in the context vector space have a similar co-occurrence pattern, the meanings of these words are likely to be the same. As a result, the similarity of meaning between two words increases in proportion to the inner product of the vectors of these words.
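Before describing each stage in detail, the following sketch outlines the generation process of Figure 1 as plain function stubs. It is only an illustration of the control flow described above; the function names and signatures are our own and do not come from the DOCUSUM implementation.

```python
from typing import Dict, List, Sequence

Vector = Dict[str, float]   # word -> co-occurrence weight in the context vector space
Cluster = List[str]         # a lexical cluster is a group of content words


def extract_content_words(document: Sequence[str]) -> List[str]:
    """Keep only the proper/common nouns of the POS-tagged document."""
    raise NotImplementedError


def vectorize(words: List[str], context_space: Dict[str, Vector]) -> Dict[str, Vector]:
    """Stage 1: look each content word up in the pre-built context vector space."""
    return {w: context_space[w] for w in words if w in context_space}


def cluster_words(vectors: Dict[str, Vector]) -> List[Cluster]:
    """Stage 2: k-means lexical clustering of the word vectors (Section 4.2)."""
    raise NotImplementedError


def identify_topic_keywords(clusters: List[Cluster], words: List[str]) -> List[str]:
    """Stage 3: score clusters, keep core clusters, pick topic keywords (Section 4.3)."""
    raise NotImplementedError


def generate_summary(document: Sequence[str], keywords: List[str],
                     title: str, length: int) -> List[str]:
    """Stage 4: query-based sentence extraction with two queries (Section 4.4)."""
    raise NotImplementedError


def summarize(document: Sequence[str], title: str,
              context_space: Dict[str, Vector], length: int = 4) -> List[str]:
    """Run the four DOCUSUM stages end to end."""
    words = extract_content_words(document)
    vectors = vectorize(words, context_space)
    clusters = cluster_words(vectors)
    keywords = identify_topic_keywords(clusters, words)
    return generate_summary(document, keywords, title, length)
```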

The content words in the document are converted into vectors of the context vector space. In the second stage, the converted vectors are clustered by the k-means algorithm, and DOCUSUM identifies core clusters. To identify them, we developed a scoring measure for clusters: the score of a cluster is determined by the normalized sum of the term frequencies within the cluster, and the core clusters are identified by this cluster score.

In the third stage, DOCUSUM selects topic keywords that are strongly related to the topic of the document. These topic keywords are selected from the core clusters by term frequency.

After finding the topic keywords, we use them as a query. Candidate sentences for a summary are extracted by a topic keywords query and a title query, respectively. The summary is extracted from the candidate sentences by the following criteria.

1. The candidate sentences extracted in common by each query.
2. If the sentences from 1 are not enough for a summary, the candidate sentences located in the leading positions.

A document with no title can still be summarized using only the topic keywords query.

4 DOCUSUM

4.1 Context Vector Space

DOCUSUM uses a context vector space in order to represent words as vectors. This context vector space is very similar to the Hyperspace Analogue to Language (HAL) model, which developed high-dimensional vector representations for words [8]. Each word in the context vector space is represented by its co-occurrence values with the other element words, as shown in Figure 2. If any two words have a similar co-occurrence pattern, the meaning of these words is similar. In the context vector space, the meaning similarity between words can be calculated by the inner product or the cosine metric. Thus the meaning similarity between two words increases in proportion to the inner product of the two vectors.

[Figure 2. An example of the context vector space when the document words are {shop, price, virtual} and the element words of the vectors are {market, company, plan, game, space}.]

In order to build the context vector space, the co-occurrence values of words are required. To calculate these co-occurrence values, the articles of the Chosun Ilbo newspaper from 1996 to 1997 (approximately 16,600,000 words and 1,538,320 sentences) are used. After all sentences of the articles are tagged by a morphological analyzer, only proper nouns and common nouns are used for calculating the co-occurrence statistics. In the analysis of co-occurrence, low-frequency words (frequency less than 3) are eliminated. One sentence is used as the size of the sliding window for measuring co-occurrence frequency. To reduce noise, unrelated word pairs are eliminated according to the mutual information score shown in Equation 1,

I(x, y) = \log \frac{\Pr(x \wedge y)}{\Pr(x)\,\Pr(y)}    (1)

where Pr(x) is the probability of word x occurring, Pr(y) is the probability of word y occurring, and Pr(x ∧ y) is the probability of words x and y occurring at the same time. The word pairs with a mutual information score less than 3 are dropped. Through this process, a total of 2,429,342 noun pairs were created. Each element of a context vector is the co-occurrence value of two words given by Equation 2,

\cos(X, Y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}    (2)

where x_i and y_i are the term frequencies of words X and Y in sentence i, respectively, and n is the total number of sentences in the corpus. This method can be seen as calculating a similarity score between two words based on the degree of their co-occurrence in the same sentences.
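As an illustration of this construction, the sketch below builds a small context vector space from noun-only sentences: co-occurrence is counted within a one-sentence window, word pairs with a low mutual information score are dropped as in Equation 1, and each surviving vector element is the cosine of the two words' sentence frequency vectors as in Equation 2. It is a minimal sketch under our own assumptions about the data structures, and it omits the restriction of the vector elements to a fixed set of high-frequency element words discussed next.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, List


def build_context_space(sentences: Iterable[List[str]],
                        min_word_freq: int = 3,
                        min_mi: float = 3.0) -> Dict[str, Dict[str, float]]:
    """Build a sparse context vector space from noun-only sentences.

    Each input sentence is a list of nouns (the output of a morphological
    analyzer); the co-occurrence window is one sentence, as in the paper.
    """
    sentences = [list(s) for s in sentences]
    n = len(sentences)

    # Per-word sentence-frequency vectors: word -> {sentence index: term frequency}.
    sent_tf: Dict[str, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for i, sent in enumerate(sentences):
        for w in sent:
            sent_tf[w][i] += 1

    doc_freq = {w: len(v) for w, v in sent_tf.items()}
    vocab = {w for w, f in doc_freq.items() if f >= min_word_freq}  # drop rare words

    # Sentence-level co-occurrence counts for word pairs.
    pair_freq = Counter()
    for sent in sentences:
        for x, y in combinations(sorted(set(sent) & vocab), 2):
            pair_freq[(x, y)] += 1

    space: Dict[str, Dict[str, float]] = {w: {} for w in vocab}
    for (x, y), f_xy in pair_freq.items():
        # Equation 1: drop word pairs whose mutual information is too low.
        mi = math.log((f_xy / n) / ((doc_freq[x] / n) * (doc_freq[y] / n)))
        if mi < min_mi:
            continue
        # Equation 2: the vector element is the cosine of the two words'
        # sentence term-frequency vectors.
        dot = sum(tf * sent_tf[y].get(i, 0) for i, tf in sent_tf[x].items())
        norm_x = math.sqrt(sum(tf * tf for tf in sent_tf[x].values()))
        norm_y = math.sqrt(sum(tf * tf for tf in sent_tf[y].values()))
        value = dot / (norm_x * norm_y)
        space[x][y] = value
        space[y][x] = value
    return space


def word_similarity(space: Dict[str, Dict[str, float]], x: str, y: str) -> float:
    """Inner product of two context vectors, used as the word similarity measure."""
    u, v = space.get(x, {}), space.get(y, {})
    return sum(w * v[k] for k, w in u.items() if k in v)
```

In the later stages, word-to-word similarity is then the inner product (or cosine) of two such context vectors, as computed by word_similarity above.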

When 60,000 words are each vectorized against the other 60,000 words, a 60,000-by-60,000 matrix is needed. This causes trouble in running speed and memory space, so it is difficult to apply in a practical application. According to the research results on the context vector representation in [8], similar performance was reported with vectors of only 200 dimensions. This shows that it is possible to reduce the dimensionality of each word vector.

To reduce the dimensionality of the vectors in our system, we evaluated the performance of dimensionality reduction for vector dimensions from 100 to 800 elements. The element words used to represent words were selected from the Chosun Ilbo newspaper according to high term frequency. The reason we selected the elements by high term frequency is that word pairs with low frequency tend not to hold enough co-occurrence information. The best performance was obtained at 600 dimensions, as shown in Figure 3.

[Figure 3. Performance (F1 measure) with various word vector dimensions (100 to 800); the best result is obtained at 600 dimensions.]

In Figure 3, the F1 measure on the validation set is used as the performance measure. The construction of the context vector space is a one-time task in the training process.

4.2 Lexical Clustering

The goal of building the context vector space is to extract topic keywords. However, we cannot extract topic keywords from the context vector space directly. Therefore, DOCUSUM clusters the words so that similar words are aggregated in the same cluster, and an importance score is then assigned to each cluster.

The assumption that topic keywords are found in core clusters is very similar to the idea of Barzilay and Elhadad that strong lexical chains are strongly related to the document topic [5]. Lexical chains are structures of similar words in a document [9]. Lexical clusters also have distinct semantic categories, while they are more loosely connected than lexical chains. DOCUSUM can recognize the topic of a document as several categories by clustering words and selecting the most topic-related clusters.

For clustering words, the words with proper or common noun POS tags are extracted from a given document for vectorization. They are vectorized in the pre-built context vector space. These word vectors are clustered using the k-means algorithm, and the inner product is used as the similarity measure in the k-means algorithm.

The k-means algorithm is known as a very effective algorithm for unsupervised clustering. To use it, we need to set the initial number of clusters. We assumed that the number of clusters increases with the number of distinct words in a document. To determine the initial number, we experimented with various k values: we set k to the quotient of the total number of words in a document divided by a pre-determined number (from 20 to 100), where this pre-determined number means the number of words within a cluster. As a result, we achieved the best performance when the pre-determined number was 60, as shown in Figure 4.

[Figure 4. Performance (F1 measure) with various k values, i.e., with the ratio of clusters (words per cluster) varied from 20 to 100, compared against a baseline using the most frequent word; the best result is obtained at 60 words per cluster.]

In Figure 4, the F1 measure on the validation set is again used as the performance measure.
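The clustering stage can be sketched as follows, assuming the document's noun vectors are dense arrays (for example, the 600-dimensional vectors of Section 4.1). The sketch uses a plain k-means loop with the inner product as the similarity measure and sets k from the words-per-cluster ratio described above; the random initialization and fixed iteration count are our own simplifications, not details of DOCUSUM.

```python
import numpy as np
from typing import Dict, List


def kmeans_lexical_clusters(word_vectors: Dict[str, np.ndarray],
                            words_per_cluster: int = 60,
                            n_iter: int = 20,
                            seed: int = 0) -> List[List[str]]:
    """Cluster document words with k-means, k = (#distinct words) / words_per_cluster."""
    words = list(word_vectors)
    matrix = np.stack([np.asarray(word_vectors[w], dtype=float) for w in words])
    k = max(1, len(words) // words_per_cluster)

    rng = np.random.default_rng(seed)
    centroids = matrix[rng.choice(len(words), size=k, replace=False)].copy()

    assignment = np.zeros(len(words), dtype=int)
    for _ in range(n_iter):
        # Assign each word to the centroid with the largest inner product.
        scores = matrix @ centroids.T            # shape (n_words, k)
        assignment = scores.argmax(axis=1)
        # Recompute each centroid as the mean of its member vectors.
        for j in range(k):
            members = matrix[assignment == j]
            if len(members):
                centroids[j] = members.mean(axis=0)

    clusters: List[List[str]] = [[] for _ in range(k)]
    for word, label in zip(words, assignment):
        clusters[label].append(word)
    return [c for c in clusters if c]
```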

To verify our lexical clustering method, we built a baseline that uses the most frequent word as the topic keyword (i.e., the number of clusters per document is set to one) and compared it with our method. As shown in Figure 4, our method outperforms this baseline for every ratio of clusters.

To select the most topic-related clusters, the following Equation 3 is used,

\mathrm{Score}(\mathrm{Cluster}) = \frac{\sum_{i=1}^{n} tf_i}{n}    (3)

where tf_i is the term frequency of the i-th word and n is the number of distinct words in the cluster. The cluster score increases in proportion to the average frequency of the words in the cluster. This scoring measure is based on the notion that topic keywords are more frequent. Here, DOCUSUM aggregates synonyms occurring together into the same concept using lexical clustering, without other resources such as the WordNet.

After scoring the clusters, the cluster with the top score is selected as a core cluster. Clusters whose score is within a 10% margin of the top cluster's score are also selected as core clusters. This selection strategy makes our method flexible with respect to document style and the various compression rates of summarization. In our experiments, we use a 20% margin for the multi-topic data set. These threshold values were set by experiments on the validation set.

4.3 Topic Keyword Identification

The next stage selects topic keywords from the core clusters. In topic keyword identification, the term frequency of a word is used as its score. To select topic keywords from the core clusters, we use the same method as for selecting core clusters: the topic keywords are the top-frequency words within the core clusters, and words whose frequency is within a 10% margin of the top-frequency word are also selected as topic keywords.

4.4 Query-based Summarization

Query-based summarization makes a summary by extracting relevant sentences from a document [10]. The criterion for extraction is given as a query: the probability of a sentence being included in the summary increases with the number of words that co-occur in the query and the sentence.

Usually the title has been used as the query. DOCUSUM, however, uses two queries: one is generated from the title, and the other is generated from the topic keywords identified by lexical clustering. Candidate sentences for the summary are extracted by the topic keywords query and the title query, respectively. The summary is composed of sentences extracted in the following two steps. In the first step, we extract the candidate sentences retrieved in common by both queries. Then, if we cannot obtain enough sentences, we extract the candidate sentences located in the leading positions in the second step. By this method, DOCUSUM makes a summary oriented to both the title and the topic keywords.

The inner product metric is used as the similarity measure between a query and the sentences in DOCUSUM. To represent sentences, only proper and common nouns are used after eliminating stop words. In sentence vectorization, Boolean weighting is used as follows,

S_i = (w_{i1}, w_{i2}, w_{i3}, \ldots, w_{ik}), \qquad w_{ik} = \begin{cases} 1 & \text{if } tf_{ik} > 0 \\ 0 & \text{otherwise} \end{cases}    (4)

where tf_{ik} is the term frequency of word k in sentence i, and S_i is the sentence vector. In general, the tf.idf representation has shown better performance in information retrieval. However, the binary representation has generally shown better performance in summarization [11].

Using Equation 5, we calculate the similarity of two vectors in the document vector space model [12],

sim(S_i, S_j) = \sum_{k=1}^{n} w_{ik} w_{jk}    (5)

where n is the number of words included in the document and w_{ik} is the weight of the k-th word in the i-th sentence. Longer sentences are more likely to be included in the summary because the similarity measure is the inner product.

Finally, we partially use several statistical features such as the title, location, frequency, and length of a sentence, and combine them naturally into our system. Moreover, DOCUSUM uses topic keywords as a way to overcome the limitation of surface-level statistics, which depend heavily on the text genre.
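A minimal sketch of Sections 4.2-4.4 is given below under our own simplifying assumptions: clusters are scored by average term frequency (Equation 3), core clusters and topic keywords are selected with a relative margin, each sentence is reduced to its set of nouns (the Boolean vector of Equation 4), and sentences matched by both the title query and the topic keywords query are preferred, with leading sentences as the fallback. The function names and the exact fallback rule are illustrative only, not the DOCUSUM implementation.

```python
from typing import Dict, List, Sequence, Set


def score_cluster(cluster: Sequence[str], tf: Dict[str, int]) -> float:
    """Equation 3: average term frequency of the distinct words in a cluster."""
    return sum(tf.get(w, 0) for w in cluster) / len(cluster)


def select_topic_keywords(clusters: List[List[str]], tf: Dict[str, int],
                          margin: float = 0.10) -> List[str]:
    """Keep core clusters within `margin` of the best cluster score, then keep
    words within `margin` of the most frequent word in those clusters."""
    best = max(score_cluster(c, tf) for c in clusters)
    core = [c for c in clusters if score_cluster(c, tf) >= (1 - margin) * best]
    words = [w for c in core for w in c]
    top_tf = max(tf.get(w, 0) for w in words)
    return [w for w in words if tf.get(w, 0) >= (1 - margin) * top_tf]


def extract_summary(sentences: List[Set[str]], title_query: Set[str],
                    keyword_query: Set[str], length: int = 4) -> List[int]:
    """Pick sentences matched by both queries, ranked by the summed Boolean
    inner products (Equation 5); fall back to leading sentences if too few."""
    scored = [(len(s & title_query) + len(s & keyword_query), i)
              for i, s in enumerate(sentences)
              if s & title_query and s & keyword_query]
    chosen = [i for _, i in sorted(scored, reverse=True)[:length]]
    for i in range(len(sentences)):          # fallback: leading position
        if len(chosen) >= length:
            break
        if i not in chosen:
            chosen.append(i)
    return sorted(chosen)
```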

5 Empirical Evaluation

5.1 Data Sets and Experimental Settings

In our experiments, we used the summarization test set of the KOrea Research and Development Information Center (KORDIC). This data is composed of news articles from several genres such as politics, culture, economics, and sports. Each test document has a title, the content, a 30% summary, and a 10% summary. The 30% and 10% summaries of a test document were made by manually extracting sentences from the content. Although the size of the summarization test set was reported as 1,000 documents [13], we used 816 document-summary pairs after eliminating duplicate articles and inadequate summary pairs. Statistical features of this test set are given in Table 1.

Table 1. Statistical features of the experimental data
Total number of documents: 816
Total number of sentences: 13,358
Total number of sentences in the 10% summaries: 1,348
Total number of sentences in the 30% summaries: 3,594
Average number of sentences per document: 16.37
Average number of sentences in the 30% summary per document: 4.40
Average number of nouns in the title: 6.78
Average number of nouns per sentence: 11.97

To set the dimension of the vectors and the number of clusters in the k-means algorithm, we used 269 documents as a validation set. Hence, the test set in our experiments is composed of the rest of the data set (547 documents).

Our test data set consists of documents with a single topic. However, the summarization of documents with multiple topics is also important, because a document with a large topic can consist of a number of smaller topics [14]. To verify our method on multi-topic document summarization, we constructed new test data from the original data set by combining two documents into one. We call this new data set MT-data (the Multi-Topic document data set) and the original data set ST-data (the Single-Topic document data set). Table 2 shows the settings of our experimental data sets.

Table 2. The settings of the experimental data sets
            Validation set   Test data
ST-data     269              547
MT-data     135              274

Creating a summary of fixed length is more appropriate than creating a summary at a certain compression rate, because a summary is not related to the length of the document [10]. We hence experimented at various compression rates (10%, 30%) and at a fixed length (4 or 8 sentences). To measure the performance of the methods, the F1 measure is used, as given in Equation 6,

F_1 = \frac{2 (P \cdot R)}{P + R}    (6)

where P is precision and R is recall.

5.2 Other Summarization Methods

Our method is compared with the following five methods.

Title Method: The score of a sentence is calculated as the number of words shared between the sentence and the title. This calculation is carried out with a query built from the title in a Boolean-weighted vector space model.

Location Method: It has been said that the leading sentences of an article are important and form a good summary [15]. Therefore, the location method extracts the leading sentences, up to the compression rate or the fixed 4 or 8 sentences, as the summary.

Frequency Method: The frequency of term occurrences within a document has often been used for calculating the importance of sentences [16]. In this method, the score of a sentence is calculated as the sum of the scores of the words in the sentence. The importance score w_i of word i is calculated by the traditional tf.idf method as follows [12],

w_i = tf_i \times \log \frac{N}{df_i}    (7)

where tf_i is the term frequency of word i in the document, N is the total number of texts, and df_i is the document frequency of word i in the whole data set.
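As a small illustration of this baseline, the snippet below scores sentences by summing the tf.idf weights of Equation 7 over their words; the data structures are our own assumptions and are not taken from the compared systems.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_sentence_scores(doc_sentences: List[List[str]],
                          doc_freq: Dict[str, int],
                          n_texts: int) -> List[float]:
    """Equation 7: w_i = tf_i * log(N / df_i), summed over each sentence's words."""
    tf = Counter(w for sent in doc_sentences for w in sent)   # document-level tf
    scores = []
    for sent in doc_sentences:
        score = sum(tf[w] * math.log(n_texts / doc_freq[w])
                    for w in sent if doc_freq.get(w, 0) > 0)
        scores.append(score)
    return scores
```

A summary is then formed by taking the top-scoring sentences up to the target length.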

MMR: In MMR (maximal marginal relevance), a sentence has high marginal relevance if it is relevant to the query and has minimal similarity to previously selected sentences [17]. This method strives to maximize marginal relevance in summarization by the following equation,

\arg\max_{D_i \in R \setminus S} \left[ \lambda\, sim_1(D_i, Q) - (1 - \lambda) \max_{D_j \in S} sim_2(D_i, D_j) \right]    (8)

where R is the ranked list of sentences created by the initial query, D_i is a sentence in the ranked list, R \ S denotes the set of as yet unselected sentences in R, Q is the query, S is the subset of sentences in R already selected, sim_1 is the similarity metric used in ranking by the initial query, and sim_2 is the same similarity metric applied between sentences. With intermediate values of λ in the interval [0, 1], a linear combination of both criteria is optimized. In our experiments, we set the value of λ to 0.7.

Microsoft Word: Microsoft Word, a commercial word processor, has a summarization function, so we include the summaries produced by Microsoft Word in our experiments.

5.3 Experimental Results

The performance of DOCUSUM was measured three times because DOCUSUM conducts clustering with the k-means algorithm; the reported results of DOCUSUM are the averages of these three runs. The experimental results are shown in Table 3 and Table 4.

Table 3. Experimental results on ST-data
Method            30%      10%      4 Sentences
DOCUSUM           0.503    0.522    0.538
Title             0.489    0.442    0.518
MMR (λ=0.7)       0.473    0.423    0.521
DOCUSUM*          0.443    0.415    0.467
Location          0.496    0.472    0.517
Frequency         0.352    0.13     0.382
Microsoft Word    0.272    0.128    N/A

Table 4. Experimental results on MT-data
Method            30%      10%      8 Sentences
DOCUSUM           0.451    0.360    0.494
Title             0.448    0.358    0.492
MMR (λ=0.7)       0.439    0.355    0.519
DOCUSUM*          0.377    0.280    0.426
Location          0.360    0.333    0.381
Frequency         0.348    0.138    0.381
Microsoft Word    0.245    0.121    N/A

* DOCUSUM* used only the topic keywords query, without the title query.

DOCUSUM* showed lower performance than the Title and MMR methods, which use the title as a query. Nevertheless, since DOCUSUM* showed better performance than the other methods that do not use the title query, it can be useful for summarizing texts with no title or texts outside the news genre. Note that we achieved better performance than the Frequency method and Microsoft Word on both data sets, while we could not outperform the Location method on ST-data. Since the test data comes from news articles, the Location method performs well on ST-data; it is certain that the leading sentences of a news article are important and can serve as a good summary.

DOCUSUM using the two queries showed the best performance in most of our experiments. Since DOCUSUM achieved these results without any manually constructed resources such as the WordNet or a discourse parser, our method can be considered the more effective. Table 5 shows the statistical features of lexical clustering in our experiments for the 30% summary.

Table 5. Statistical features of lexical clustering in the experiments
                                                ST-data    MT-data
Average number of clusters per document         2.98       4.36
Average number of core clusters per document    1.38       2.32
Average number of topic keywords per document   1.79       3.03
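For reference, the following sketch shows one way to implement the MMR re-ranking of Equation 8 that was used as a baseline above, with λ = 0.7 and a generic similarity function; the greedy loop and stopping rule are our own assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def mmr_select(candidates: Sequence[T], query: T,
               sim: Callable[[T, T], float],
               n_select: int, lam: float = 0.7) -> List[T]:
    """Equation 8: greedily pick sentences that are relevant to the query
    but not redundant with the sentences already selected."""
    remaining = list(candidates)
    selected: List[T] = []
    while remaining and len(selected) < n_select:
        def mmr_score(d: T) -> float:
            redundancy = max((sim(d, s) for s in selected), default=0.0)
            return lam * sim(d, query) - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```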

6 Conclusions and Future Work

DOCUSUM, using two queries generated from the title and the topic keywords, showed better performance than the other methods. To substitute for human knowledge, we constructed a context vector space by analyzing co-occurrence values from a large corpus. This context vector space was used for clustering words with similar meanings. In comparison with the WordNet, a context vector space can be easily extended and constructed. Using topic keyword identification, documents with no title and documents outside the news genre can be summarized by DOCUSUM with significant performance.

We plan to work on the following problems. First of all, to improve the calculation of semantic distance between words, we plan to study other context vector space models. Furthermore, we will study how to generate abstracts after extracting topic-related sentences as a summary.

References

[1] I. Mani, 2001, Automatic Summarization, John Benjamins Publishing Co., pp.1-22.
[2] H. P. Luhn, 1959, The Automatic Creation of Literature Abstracts, IBM Journal of Research and Development, pp.159-165.
[3] H. P. Edmundson, 1968, New Methods in Automatic Extraction, Journal of the ACM 16(2): pp.264-285.
[4] C. Aone, M. E. Okurowski, J. Gorlinsky, and B. Larsen, 1997, A Scalable Summarization System using Robust NLP, In Proceedings of the Workshop on Intelligent Scalable Text Summarization, ACL/EACL '97, pp.66-73.
[5] R. Barzilay and M. Elhadad, 1999, Using Lexical Chains for Text Summarization, Advances in Automatic Summarization, The MIT Press: pp.111-121.
[6] G. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. Miller, 1990, Introduction to WordNet: An On-line Lexical Database, International Journal of Lexicography (special issue) 3(4): pp.234-245.
[7] E. Hovy and C. Y. Lin, 1997, Automated Text Summarization in SUMMARIST, In Proceedings of the Workshop on Intelligent Text Summarization, ACL/EACL '97.
[8] C. Burgess and K. Lund, 1997, Modeling Cerebral Asymmetries of Semantic Memory Using High-dimensional Semantic Space, In Getting It Right: The Cognitive Neuroscience of Right Hemisphere Language Comprehension, Hillsdale, N.J.
[9] J. Morris and G. Hirst, 1991, Lexical Cohesion Computed by Thesaural Relations as an Indicator of the Structure of Text, Computational Linguistics 17(1): pp.21-43.
[10] J. Goldstein, M. Kantrowitz, V. Mittal, and J. Carbonell, 1999, Summarizing Text Documents: Sentence Selection and Evaluation Metrics, In Proceedings of ACM-SIGIR '99: pp.121-128.
[11] K. Han, D. Baek, and H. Rim, 2000, Automatic Text Summarization Based on Relevance Feedback with Query Splitting, In Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, pp.201-202.
[12] G. Salton, 1989, Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Addison-Wesley Publishing Company.
[13] T. H. Kim, H. R. Park, and J. H. Shin, 1999, Research on Text Understanding Model for IR/Summarization/Filtering, The Third Workshop on Software Science, Seoul, Korea.
[14] M. Utiyama and K. Hasida, 2000, Multi-Topic Multi-Document Summarization, In Proceedings of the 18th International Conference on Computational Linguistics, pp.892-898.
[15] M. Wasson, 1998, Using Leading Text for News Summaries: Evaluation Results and Implications for Commercial Summarization Applications, In Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the ACL, pp.1364-1368.
[16] K. Zechner, 1997, Fast Generation of Abstracts from General Domain Text Corpora by Extracting Relevant Sentences, In Proceedings of the 16th International Conference on Computational Linguistics, pp.986-989.
[17] J. Carbonell and J. Goldstein, 1998, The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries, In Proceedings of the 21st ACM-SIGIR International Conference on Research and Development in Information Retrieval.
Lin, Automated Text of MMR, Diversity-Based Reranking for Summarization in SUMMARIST, 1997, In Reordering Documents and Producing Proceedings of the Workshop on Intelligent Summaries, In Proceedings of the 21th Text Summarization, ACL/EACL ’ 97. ACM-SIGIR International Conference on [8] C. Burgess and K. Lund, 1997, Modeling Research and Development in Information cerebral asymmetries of semantic memory Retrieval.