IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 15, NO. 1, FEBRUARY 2007 121 Fuzzy Rough Sets: The Forgotten Step Martine De Cock, Chris Cornelis, and Etienne E. Kerre

Abstract—Traditional rough uses equivalence rela- After a public debate reflecting rivalry between rough set tions to compute lower and upper approximations of sets. The theory and the slightly older theory, many researchers corresponding equivalence classes either coincide or are disjoint. started working towards a hybrid theory (e.g., [11], [18], [20], This behaviour is lost when moving on to a fuzzy T-equivalence relation. However, none of the existing studies on fuzzy rough set [25], [26], [28], and [30]). In doing so, the central focus moved theory tries to exploit the fact that an element can belong to some from elements’ indistinguishability (objects are indistinguish- degree to several “soft similarity classes” at the same time. In able or not) to their similarity (objects are similar to a certain de- this paper we show that taking this truly fuzzy characteristic into gree), represented by a fuzzy relation . As a result, objects are account may lead to new and interesting definitions of lower and categorized into classes with “soft” boundaries based on their upper approximations. We explore two of them in detail and we investigate under which conditions they differ from the commonly similarity to one another; abrupt transitions between classes are used definitions. Finally we show the possible practical relevance replaced by gradual ones, allowing that an element can belong of the newly introduced approximations for query refinement. (to varying degrees) to more than one class. Index Terms—Fuzzy rough set, lower and upper approximation, Soon, researchers started exploring possible applications of query refinement, transitivity. the new paradigm of fuzzy-rough hybridization. For a compre- hensive literature review up to 1999, we refer to [15]. Among the more recent work is that of Drwal [10] and Jensen and Shen I. INTRODUCTION [13], [14], who studied extensions of the well-known rough set INCE its introduction in the 1960s, fuzzy set theory has approaches to data reduction and classification. Shad a significant impact on the way we represent and com- In Section II, we recall the necessary background leading to pute with vague information. More recently it has become part the definition of a fuzzy rough set as presented in [26]. This of the larger paradigm of soft computing, a collection of tech- definition is an elegant fuzzification of the concept of a rough niques that are tolerant of typical characteristics of imperfect set and at the same time absorbs earlier suggestions in the same data and knowledge—such as vagueness, imprecision, uncer- direction. The most striking aspect of all the studies on fuzzy tainty, and partial truth—and hence adhere closer to the human rough set theory mentioned above however is that none of them mind than conventional hard computing techniques. During the tries to exploit the fact that an element of can belong to last decades new approaches have been developed that gener- some degree to several “soft similarity classes” at the same time. alize the original fuzzy set theory (which is also called type-1 This property does not only lie at the heart of fuzzy set theory but fuzzy set theory in this context). Type-2 fuzzy sets, intuition- is also crucial in the decision on how to define lower and upper istic fuzzy sets, interval-valued fuzzy sets and fuzzy rough sets approximations. For instance, in traditional rough set theory, have in common that they can all be formally characterized by belongs to the lower approximation of if the equivalence class membership functions taking values in a partially ordered set to which belongs is included in . But what happens if , which is no longer the same (but an extension of) the set of belongs to several “soft similarity classes” at the same time? membership degrees used in fuzzy set theory. The intro- Do we then require that all of them are included in ? Most of duction of such new, generalizing theories is often accompanied them? Or just one? And then, which one? In Sections III and by lengthy discussions on issues such as the choice of termi- IV, we continue this discussion touched upon for the first time nology and the added value of the generalization. in [5]. In this paper we focus on fuzzy rough set theory. Pawlak [23] As such it becomes clear that there is still significant room launched rough set theory as a framework for the construction for improvement and generalization of the definition of a fuzzy of approximations of concepts when only incomplete informa- rough set, beyond the most “obvious” fuzzification established tion is available. The available information consists of a set so far. Furthermore Section V reveals that this generalization of examples (a subset of a universe being a nonempty set is not just of theoretical interest but becomes crucial in a top- of objects we want to say something about) of a concept , ical application such as query refinement for searching on the and a relation in . models “indiscernibility” or “indistin- WWW. guishability” and therefore generally is a tolerance relation (i.e., a reflexive and symmetrical relation) and in most cases even an II. FROM ROUGH SETS TO FUZZY ROUGH SETS equivalence relation (i.e., a transitive tolerance relation). Rough set analysis makes statements about the membership of some element of to the concept of which is a set of ex- Manuscript received February 15, 2005; revised December 18, 2005, March amples, based on the indistinguishability between and the el- 20, 2006, and June 16, 2006. ements of . To arrive at such statements, is approximated in The authors are with the Department of Applied Mathematics and Computer two ways. An element of belongs to the lower approxima- Science, Ghent University, B-9000 Gent, Belgium (e-mail: Martine.De- [email protected]; [email protected]; [email protected]). tion of if the equivalence class to which belongs is included Digital Object Identifier 10.1109/TFUZZ.2006.889762 in . On the other hand belongs to the upper approximation

1063-6706/$25.00 © 2007 IEEE 122 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 15, NO. 1, FEBRUARY 2007

TABLE I WELL-KNOWN T-NORMS; x AND y IN ‘HY I“

TABLE II WELL-KNOWN S-IMPLICATORS; x AND y IN ‘HY I“

Fig. 1. Lower and upper approximation of a set e. The dark shaded area is the boundary region. TABLE III WELL-KNOWN RESIDUAL IMPLICATORS; x AND y IN ‘HY I“ of if its equivalence class has a non-empty intersection with . Formally, let be a universe and an equivalence relation. The lower and the upper approximation (in the sense of Pawlak [23]) of a subset of in the approximation space are the sets and such that for all in for all in . The fuzzy logical counterparts of the connectives (1) in (3) and (4) play an important role in this paper; we there- (2) fore recall some preliminaries in detail. Throughout this paper, let and denote a triangular norm and an implicator, respec- In other words tively. Recall that a triangular norm (t-norm for short) is any increasing, commutative and associative map- (3) ping satisfying , for all in . A negator is (4) a decreasing mapping satisfying and . is called involutive if for all in The underlying meaning is that is the set of elements nec- . Finally, an implicator is any -mapping essarily satisfying the concept (strong membership), while satisfying , for all in . More- is the set of elements possibly belonging to the concept (weak over we require to be decreasing in its first, and increasing in membership); for belongs to if all elements of in- its second component. If is a t-norm, the mapping defined distinguishable from belong to (hence, there is no doubt by, for all and in [0, 1], that also belongs to ), while belongs to as soon as an element of is indistinguishable from . It holds that .If belongs to the boundary region , and (5) then there is some doubt, because in this case is at the same time indistinguishable from at least one element of and at least is an implicator, usually called the residual implicator (of ). If one element of that is not in . This is illustrated graphically is a t-norm and is an involutive negator, then the mapping in Fig. 1. We call a rough set (in ) as soon as defined by, for all and in [0, 1], there is a set in such that and (see e.g., [26]). For completeness we mention that a second stream con- (6) cerning rough sets in the literature was initiated by Iwinski [12] is an implicator, usually called the S-implicator induced by who did not use an equivalence relation or tolerance relation and . In Tables I–III, we mention some well known t-norms, as an initial building block to define the rough set concept. S- and residual implicators. The S-implicators in Table II are Although his formulation provides an elegant mathematical induced by means of the standard negator which is defined model, the absence of the equivalence relation makes his model, by , for all in . as well as the fuzzy rough set theoretical models developed in Because equivalence relations are used to model equality, the same spirit (see, e.g., [21]), hard to interpret. Therefore, we fuzzy -equivalence relations are commonly considered to rep- do not deal with it in this paper. resent approximate equality or similarity. Recall that a fuzzy re- In the context of fuzzy rough set theory, is a fuzzy set in lation in is called a fuzzy -equivalence relation iff for all , i.e., an mapping, while is a fuzzy relation in and in , i.e., a fuzzy set in . Recall that for all in , the -foreset of is the fuzzy set defined by reflexivity symmetry transitivity DE COCK et al.: FUZZY ROUGH SETS: THE FORGOTTEN STEP 123

Paraphrasing statements (3) and (4) and absorbing earlier suggestions in the same direction, the following definition of the lower and upper approximation of a fuzzy set in was given in [26], constructed by means of an implicator , a t-norm and a fuzzy -equivalence relation in

(7) (8) for all in . is called a fuzzy rough set (in ) as soon as there is a fuzzy set in such that and . Formulas (7) and (8) for and can Fig. 2. Fuzzy similarity classes. also be interpreted as the degree of inclusion of in and the degree of overlap of and respectively, which indicates the semantical link with (1) and (2). of a pseudo-metric [4]. Let the fuzzy -equivalence relation in be defined by III. THE FORGOTTEN STEP

A. New Approximations for all and in . Fig. 2 depicts the -foresets of 1.3, 2.2, 3.1, The role of the equivalence class in the crisp case [see and 4. The -foresets of 3.1 and 4 are clearly different. Still one (1) and (2)] is subsumed by the more general concept of the can easily see that -foreset in the fuzzy case [see (7) and (8)]. It is well known that in the crisp case, if we consider two equivalence classes then they either coincide or are disjoint. It is therefore not pos- sible for to belong to two different equivalence classes at the same time. If is a fuzzy relation in —in particular a fuzzy -equivalence relation—then it is quite normal that, because Since belongs to degree 0.1 to the of the intermediate degrees of membership, different foresets -intersection of the -foresets of 3.1 and 4, i.e., these are not necessarily disjoint, as the following examples illustrate. -foresets are not disjoint. Recall that two (fuzzy) sets are disjoint iff their intersection is From now on, whenever the fuzzy relation models approx- empty, and that the -intersection of fuzzy sets and in imate equality, we will call the “fuzzy similarity class” of is defined by . From the previous examples, it is clear that does not only belong to but can also belong to other, different fuzzy sim- ilarity classes to a certain degree. Recall that, by the definition used so far, belongs to the lower approximation of to the degree to which is included in [see (7)]. In view of the for all in . discussion above however it makes sense to consider also the Example 1: Let be an arbitrary t-norm. One can verify that other fuzzy similarity classes to which has a non-zero mem- for the fuzzy -equivalence relation on given by bership degree, and to assess their inclusion into as well for the lower approximation, and their overlap with for the upper approximation. Informally, this immediately results in the fol- lowing (inexhaustive) list of candidate definitions for the lower and the upper approximation of . it holds that 1) belongs to the lower approximation of to the degree to which (9) a) all fuzzy similarity classes containing are included (10) in ; b) at least one fuzzy similarity class containing is in- hence, for any t-norm cluded in ; c) is included in . 2) belongs to the upper approximation of to the degree to (11) which a) all fuzzy similarity classes containing have a The latter shows that the foresets and are not disjoint, nonempty intersection with ; since their intersection contains both and to degree 0.2. b) at least one fuzzy similarity class containing has a Example 2: In applications is often used as a t-norm be- nonempty intersection with ; cause the notion of fuzzy -equivalence relation is dual to that c) has a nonempty intersection with . 124 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 15, NO. 1, FEBRUARY 2007

Paraphrasing these expressions, we obtain the following The tight and loose approximations should exhibit a similar be- definitions. haviour. To show that this is indeed the case, we recall that the Definition 3: Let be a fuzzy relation in and a fuzzy lower and the upper approximation are monotonic operations set in . due to the monotonicity of the fuzzy logical operators involved. 1) The tight, loose and (usual) lower approximation of are This is reflected in the next proposition. defined as Proposition 6: [26]: For every fuzzy set and in a) ; (17) b) (18) ; c) ; Combining this with Proposition 5 we conclude that the tight for all in . lower and the loose upper approximation are indeed a subset of 2) The tight, loose and (usual) upper approximation of are and a superset of , respectively, which justifies the defined as terminology. a) Proposition 7: For every fuzzy set in ; b) (19) ; c) ; Note that in [6] it is suggested to use and for all in . as representations of the modified linguistic expressions The terminology “tight” refers to the fact that we take “all” fuzzy extremely very more or less , and roughly respec- similarity classes into account, giving rise to a strict or tight re- tively (for being a fuzzy relation modelling approximate quirement. For the “loose” approximations, we only look at “the equality). best one” which is clearly a more flexible demand. Options (1c) From Propositions 4–6, we obtain and (2c) correspond to the well known definition from the lit- erature on fuzzy rough set theory [see (7) and (8)], while (1b) (20) and (2a) correspond to the generalized opening and closure op- (21) erators defined in [1]. Furthermore in the crisp case options (1a) through (1c) coincide, as well as options (2a) through (2c) be- but no immediate information about a direct relationship be- cause then there is exactly one equivalence class to which be- tween the loose lower and the tight upper approximation in longs, namely . As mentioned previously, however, different terms of inclusion, and about how itself fits in this picture. fuzzy similarity classes are not necessarily disjoint, hence, a fur- The following proposition sheds some light on this matter. ther investigation on the relationships between the tight, loose Proposition 8: [1]: If is a left continuous t-norm and its and usual approximations is in order. residual implicator then for every fuzzy set in In the remainder of this paper we will assume that is a re- flexive and symmetrical fuzzy relation in , which are basic (22) requirements if is supposed to model similarity. Some prop- erties require additional -transitivity of ; whenever this is the Proposition 8 does not hold in general for other choices of case we mention it explicitly. t-norms and implicators that do not fulfill the properties

(23) B. Links Between the Approximations (24) We start with the following proposition which follows imme- diately from the definitions due to the symmetry of and proves as Example 9 illustrates. to be very useful. Example 9: Let and be defined as in Example 1 and let Proposition 4: For every fuzzy set in be the fuzzy set in defined as and . Furthermore let and be its S-implicator. (12) Then, and , hence (13) (25) (14) (15) which makes it clear that . From all of the above, we obtain The following proposition supports the idea of approximating a concept from the lower and the upper side. Proposition 5: [26]: For every fuzzy set in provided that is left continuous and is its residual impli- cator. We stress that this holds for any reflexive and symmetric (16) fuzzy relation . This justifies the name “lower” and “upper” DE COCK et al.: FUZZY ROUGH SETS: THE FORGOTTEN STEP 125 for the new tight and loose approximations. Furthermore the fol- the round composition of fuzzy relations and in is the lowing example illustrates that the new approximations indeed fuzzy relation in defined by differ from the existing ones. Example 10: Let and the fuzzy set in defined as , for all in . Let the reflexive and symmetrical (27) fuzzy relation in be defined as for all and in . if Proposition 12: If is a left continuous t-norm then for every otherwise fuzzy set in for all and in . One can verify that for in (28)

Proof: For all in

Hence, . Furthermore

hence . We verify

Proposition 13: If is left continuous in its first component and and right continuous in its second component, and if and satisfy the shunting principle

(29)

then for every fuzzy set in

In the same way, one can verify that and . This illustrates that all the approxima- (30) tions from Definition 3 are different. The proof of Proposition 13 is analogous to the proof of C. Maximal Expansion and Reduction Proposition 12. Regarding the restrictions placed on the fuzzy Taking an upper approximation of in practice corresponds logical operators involved, recall that the shunting principle to expanding , while a lower approximation is meant to re- is satisfied both by a left continuous t-norm and its residual duce . However this refining process does not go on forever. implicator [22] as well as by a t-norm and an S-implicator The following property says that with the loose lower and the induced by it [26]. tight upper approximation maximal reduction and expansion is Let us use the following notation, for : achieved within one approximation phase. Proposition 11: [1]: If is a left continuous t-norm and its residual implicator, then for every fuzzy set in From Proposition 12 it follows that taking times successively and (26) the upper approximation of a fuzzy set under corresponds to taking the upper approximation once under the composed fuzzy To investigate the behaviour of the loose upper and tight lower relation . Proposition 13 states a similar result for the lower approximation w.r.t. expansion and reduction, we first establish approximation. Recall the following important proposition that links with the composition of with itself. Recall that in general holds for reflexive and -transitive fuzzy relations. 126 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 15, NO. 1, FEBRUARY 2007

Proposition 14: [26]: If is a fuzzy -equivalence relation Cattaneo introduces the mappings and in , then defined by

necessity In other words, using a -transitive fuzzy relation , options interior (1a) and (1c) of Definition 3 coincide, as well as options (2b) and closure (2c). The following property shows that under these conditions, possibility they also coincide with (1b), respectively, (2a). Proposition 15: [1], [2], [26]: If is a fuzzy -equivalence for all in . Applying the law of contraposition ( relation in is a left continuous t-norm and its residual if and only if ) to (33) it is easy to see that implicator then for every fuzzy set in

and (31) (35)

This means that, using a fuzzy -equivalence relation to model Now, for every crisp relation and every crisp set approximate equality, we will obtain maximal reduction or ex- and holds. This pansion in one phase, regardless of which of the approximations allows us to derive the following: from Definition 3 is used. As Example 10 illustrates, this effect is not always exhibited by non- -transitive fuzzy relations. (36) When is not -transitive and the universe is finite, it and is known that the -transitive closure of is given by (37) (assuming ) [19], hence (38) (39) (40) In other words with the lower and upper approximation, max- imal reduction and expansion will be reached in at most In [3] Cattaneo himself gives the full expressions of the crisp steps, while with the tight lower and the loose upper approxi- counterparts of Definition 3, parts 1(c) and 2(c) for and mation it can take at most steps. , respectively. In [15] and are linked to expressions corre- IV. RELATED WORK AND FURTHER COMMENTS sponding to the crisp counterparts of Definition 3, parts 1(b) and 2(a), respectively, i.e., what we call tight lower approxima- Although—to our knowledge—the tight and loose lower and tion and loose upper approximation. In the crisp case, it makes upper approximations have never been considered in the frame- sense to differentiate between and , and between work of fuzzy rough set theory, their crisp counterparts have al- and if one is dealing with a tolerance relation ready surfaced in classical rough set theory, albeit from different which is not an equivalence relation. In [15] it is also suggested angles of interpretation. The first one is due to [3]. His approach to work with “tolerance classes of some iterations of tolerance to rough sets is remarkably different from others because it does relations,” most likely referring to the composition of relations. not revolve around a notion of indistinguishability or similarity, In general in the crisp case the difference between using a tol- but around a dual notion of discernibility. This discernibility is erance relation and an equivalence relation is clear at first sight, represented by a so-called preclusivity relation, which is an ir- because with the former different similarity classes need not to reflexive and symmetrical relation. It can be obtained as the set be disjoint, while with the latter equivalence classes are either theoretical complement of an equivalence relation, or more equal or disjoint. This implies that when using an equivalence generally of that of a tolerance relation . Apart from the usual relation, it makes no sense to differentiate between “all classes set-theoretical complement of a set ,defined by containing ” and “at least one class containing ” because there is exactly one class containing , namely . Hence, the tight, (32) the usual and the loose upper approximation coincide, and so do the lower approximations. for all in , Cattaneo also defines the preclusive orthocom- Fuzzy -equivalence classes are known as the fuzzy coun- plement of : terpart of equivalence relations, but as we illustrated at the be- ginning of the previous section, they do no longer satisfy the (33) property that fuzzy similarity classes are equal or disjoint; in is the set of elements that are discernible from all ele- fact can at the same time belong to different fuzzy similarity ments of . Using also classes to a certain degree. Hence it is not possible at first sight to rule out the usefulness of the tight and loose lower and upper approximations introduced in Definition 3 as alternatives to the (34) existing lower and upper approximation for fuzzy rough sets. DE COCK et al.: FUZZY ROUGH SETS: THE FORGOTTEN STEP 127

Fig. 3. Fuzzy thesaurus.

However careful investigation of the properties of the newly overview of approaches to automatically construct such fuzzy defined approximations show that interplay between suitably thesauri based on the co-occurrences of terms in documents is chosen fuzzy logical operators and the -transitivity of the presented in [9]. fuzzy relation forces the new approximations to coincide with Let us think of a query as a fuzzy set of keywords. If the user the existing ones anyway. In the next section we will illustrate cares to give weights to indicate the importance of the individual that this is not always a desirable property in applications, keywords, these will be taken into account as the membership because it does not allow for gradual expansion or reduction of of the terms in the query. However we expect that in many cases a fuzzy set by iteratively taking approximations. Omitting the the user does not want to be bothered with the need to specify his requirement of -transitivity is precisely the key that allows query in such detail. In these cases 1 will be used as the default for a gradual expansion process. membership degree of a term appearing in the query. One of the Other undesirable effects of -transitivity w.r.t. approximate advantages of our query refinement approach is that these initial equality were pointed out in [7], [8]. More in particular it is ob- weights will be gradually adapted during the search process. served there that fuzzy -equivalence relations can never satisfy Formally, let denote the universe of terms and let the orig- the so-called Poincaré paradox. A fuzzy relation in is com- inal query be a fuzzy set in . Furthermore let be a fuzzy patible with the Poincaré paradox iff thesaurus, then query can be expanded by taking its upper ap- proximation under . In [17] a formally similar idea is promoted to expand the fuzzy set of terms associated with a document (in- This is inspired by Poincaré’s [16] experimental observation stead of a query); there it is also suggested to use the transitive that a bag of sugar of 10 grammes and a bag of 11 grammes can closure of . In [27] the connection between query expansion be perceived as indistinguishable by a human being. The same and fuzzy rough sets is established. applies for a bag of 11 grammes w.r.t. a bag of 12 grammes, In this section we will show that using a -transitive fuzzy while the subject is perfectly capable of noting a difference be- thesaurus may result in adding too many irrelevant keywords. tween the bags of 10 and 12 grammes. Now if is a fuzzy However even with a non -transitive fuzzy thesaurus we can -equivalence relation, then implies easily run into the same problem when using the upper or the [11]. Since , also loose upper approximation. We therefore promote the use of the which is in conflict with . The fact that they are not newly introduced tight upper approximation. compatible with the Poincaré paradox makes fuzzy -equiva- To illustrate our point, we constructed the small fuzzy the- lence relations less suited to model approximate equality. The saurus shown in Fig. 3 by taking into account the number of main underlying cause for this conflict is -transitivity. web pages found by a search engine for each pair of terms, as shown in Fig. 4. Let and denote the number of web V. Q UERY REFINEMENT pages that contain term , respectively term ; these numbers can be found on the diagonal in Fig. 4. On the WWW there is One of the most common ways to retrieve information from a strong bias towards related terms, hence the the WWW is keyword based search: the user inputs a query con- absolute number of web pages containing both term and sisting of one or more keywords and the search system returns a cannot be used directly to express the strength of the relation- list of web documents ranked according to their relevance to the ship between and . To level out the difference, we used the query. The same procedure is often used in e-commerce appli- following measure: cations that attempt to relate the user’s query to products from the catalogue of some company. In the basic approach documents are not returned as search results if they do not contain (one of) the exact keywords of the query. To satisfy the user who expects search engines to come up with “what they mean and not what they say”, more sophis- as shown in Fig. 5. Finally we normalized the result using the ticated techniques are needed. One option are so-called query S-function (cfr. Fig. 6), giving rise to the fuzzy refinement techniques that adapt the original query by adding thesaurus of Fig. 3. To compute its transitive closure, depicted in terms that are related to the initial keywords. This requires the Fig. 7, we used t-norm as we will do throughout this section. use of a fuzzy term-term relation, called a fuzzy thesaurus. An Furthermore we will keep on using because it is at the same 128 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 15, NO. 1, FEBRUARY 2007

Fig. 4. Number of thousands of web pages found by Google.

Fig. 5. Unnormalized fuzzy thesaurus. time a residual and an S-implicator. Now, let us consider the query

as shown in the 2nd column in Fig. 8. The meaning of the am- biguous word “apple,” which can refer both to a piece of fruit and to a computer company, is clear in this query. The disadvan- tage of using a -transitive fuzzy thesaurus becomes apparent when we compute the upper approximation , shown in the fifth column. All the terms are added with high degrees, even though terms like “mac” and “computer” have nothing to do with the semantics of the original query. This process can be slowed down a little bit by using the non -transitive fuzzy the- saurus and computing which allows for some gradual re- finement. However, an irrelevant term such as “emulator” shows xY Y  Y` up to a high degree in the second iteration, i.e., when computing Fig. 6. S-function; and in . . The main problem with the query expansion process, even if it is gradual, is a fast growth of the number of less relevant or Fig. 8 shows that the tight upper approximation performs clearly irrelevant keywords that are automatically added. This effect is better: irrelevant words such as “mac,”“computer,” and “hard- caused by the use of a flexible definition of the upper approx- ware” are still added to the query, but to a significantly lower imations in which a term is added to a query as soon as it is degree. related to one of its keywords. Therefore, we suggest the use of the tight upper approximation : a term will only be added VI. CONCLUSION to a query if all the terms that are related to are also related to Exploiting the truly fuzzy characteristic that an element can at least one keyword of the query. First, the usual upper approx- belong to some degree to different fuzzy similarity classes at imation of the query is computed but then it is stripped down the same time, we have defined the tight lower approximation by omitting all terms that are also related to other terms not be- , the loose lower approximation , the tight upper longing to this upper approximation. In this way terms that are approximation and the loose upper approximation sufficiently relevant, hence related to most keywords in , will of a fuzzy set under a reflexive and symmetrical fuzzy relation form a more or less closed context with few or no links outside, . For any left continuous t-norm and its residual implicator while a term related to only one of the keywords in in general it holds that also has many links to other terms outside and hence is omitted by taking the lower approximation. The last column of DE COCK et al.: FUZZY ROUGH SETS: THE FORGOTTEN STEP 129

Fig. 7. Transitive closure of fuzzy thesaurus.

[9] M. De Cock, S. Guadarrama, and M. Nikravesh, “Fuzzy thesauri for and from the WWW,” in Soft Computing for Information Processing and Analysis, M. Nikravesh, L. A. Zadeh, and J. Kacprzyk, Eds. New York: Springer-Verlag, 2005, vol. 164, Studies in Fuzziness and Soft Computing. [10] G. Drwal, “Rough and fuzzy-rough classification methods imple- mented in RClass system,” in Proc. 2nd Int. Conf. Rough Sets Current Trends in Computing, 2000, pp. 152–159. [11] D. Dubois and H. Prade, “Rough fuzzy sets and fuzzy rough sets,” Int. J. Gen. Syst., vol. 17, pp. 191–209, 1990. [12] T. B. Iwinski, “Algebraic approach to rough sets,” Bull. Polish Acad. Sci. Math., vol. 35, pp. 673–683, 1987. Fig. 8. Original and expanded queries. [13] R. Jensen and Q. Shen, “Fuzzy-rough attribute reduction with appli- cation to web categorization,” Fuzzy Sets Syst., vol. 141, no. 3, pp. 469–485, 2004. [14] R. Jensen and Q. Shen, “Fuzzy-rough data reduction with ant colony which justifies the names of the new approximations. When is optimization,” Fuzzy Sets and Systems, vol. 149, no. 1, pp. 5–20, 2005. in addition -transitive, the new approximations coincide with [15] J. Komorowski, Z. Pawlak, L. Polkowski, and A. Skowron, “Rough the existing ones. However when is not -transitive, the ap- sets: A tutorial,” in Rough Fuzzy Hybridization: A New Trend in De- cision-Making, S. K. Pal and A. Skowron, Eds. Singapore: Springer- proximations can be different and allow for a gradual expansion Verlag, 1999. process. We have shown the practical relevance of our results [16] H. Poincaré, La Science et l’Hypothèse. Paris, France: Flammarion, for query refinement. 1902. [17] S. Miyamoto, “Two approaches for information retrieval through fuzzy associations,” IEEE Trans. Syst., Man, Cybern., vol. 19, no. 1, pp. 123–130, Jan. 1989. ACKNOWLEDGMENT [18] N. N. Morsi and M. M. Yakout, “Axiomatics for Fuzzy Rough Sets,” Fuzzy Sets Syst., vol. 100, no. 1–3, pp. 327–342, 1998. The second author would like to thank the Research Founda- [19] H. Naessens, H. De Meyer, and B. De Baets, “Algorithms for the com- putation of „ -transitive closures,” IEEE Trans. Fuzzy Syst., vol. 10, no. tion-Flanders for funding his research. 4, pp. 541–551, Aug. 2002. [20] A. Nakamura, “Fuzzy rough sets,” Note Multiple-Valued Logic Japan, vol. 9, pp. 1–8, 1988. REFERENCES [21] S. Nanda and S. Majumdar, “Fuzzy rough sets,” Fuzzy Sets Syst., vol. 45, pp. 157–160, 1992. [1] U. Bodenhofer, “A unified framework of opening and closure opera- [22] V. Novák, I. Perfilieva, and J. Mockoˇ ˇr, Mathematical Principles of tors with respect to arbitrary fuzzy relations,” Soft Comput., vol. 7, pp. . Norwell, MA: Kluwer, 1999. 220–227, 2003. [23] Z. Pawlak, “Rough sets,” Int. J. Comput. Inform. Sci., vol. 11, no. 5, [2] U. Bodenhofer, M. De Cock, and E. E. Kerre, “Openings and closures pp. 341–356, 1982. of fuzzy preorderings: Theoretical basics and applications to fuzzy [24] Z. Pawlak, “Rough sets and fuzzy sets,” Fuzzy Sets Syst., vol. 17, pp. rule-based systems,” Int. J. Gen. Syst., vol. 32, no. 4, pp. 343–360, 99–102, 1985. 2003. [25] K. Qin and Z. Pei, “On the topological properties of fuzzy rough sets,” [3] G. Cattaneo, “Abstract approximation spaces for rough theories,” in Fuzzy Sets Syst., vol. 151, no. 3, pp. 601–613, 2005. Rough Sets in Knowledge Discovery 1: Methodology and Applications, [26] A. M. Radzikowska and E. E. Kerre, “A comparative study of fuzzy L. Polkowski and A. Skowron, Eds. Heidelberg, Germany: Physica- rough sets,” Fuzzy Sets Syst., vol. 126, pp. 137–156, 2002. Verlag, 1998, pp. 59–98. [27] P. Srinivasan, M. E. Ruiz, D. H. Kraft, and J. Chen, “Vocabulary mining [4] B. De Baets and R. Mesiar, “Pseudo-metrics and „ -equivalences,” J. for information retrieval: Rough sets and fuzzy sets,” Inform. Process. Fuzzy Math., vol. 5, no. 2, pp. 471–481, 1997. Manage., vol. 37, pp. 15–38. [5] M. De Cock, C. Cornelis, and E. E. Kerre, “Fuzzy rough sets: Beyond [28] H. Thiele, Fuzzy rough sets versus rough fuzzy sets—An interpretation the obvious,” in Proc. FUZZ-IEEE2004, 2004, vol. 1, pp. 103–108. and a comparative study using concepts of modal logic Univ. Dort- [6] M. De Cock and E. E. Kerre, “A context based approach to linguistic mund, Dortmund, Germany, Tech. Rep. ISSN 1433-3325, 1998. hedges,” Int. J. Appl. Math. Comput. Sci., vol. 12, no. 3, pp. 371–382, [29] H. Thiele, “Generalizing the explicit concept of rough set on the basis 2002. of modal logic,” in Computational Intelligence in Theory and Practice, [7] M. De Cock and E. E. Kerre, “On (un)suitable fuzzy relations to model B. Reusch and K. H. Temme, Eds. Heidelberg, Germany: Physica- approximate equality,” Fuzzy Sets Syst., vol. 133, no. 2, pp. 137–153, Verlag, 2001. 2003. [30] Y. Y. Yao, “Combination of rough and fuzzy sets based on alpha-level [8] M. De Cock and E. E. Kerre, “Why fuzzy „ -equivalence relations do sets,” in Rough Sets and : Analysis for Imprecise Data, not resolve the Poincaré paradox, and related issues,” Fuzzy Sets Syst., T. Y. Lim and N. Cercone, Eds. Norwell, MA: Kluwer, 1997, pp. vol. 133, no. 2, pp. 181–192, 2003. 301–321. 130 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 15, NO. 1, FEBRUARY 2007

Martine De Cock received the M.Sc. and Ph.D. Etienne E. Kerre received the M.Sc. and Ph.D. de- degrees in computer science from Ghent University, grees in mathematics, both from Ghent University, Gent, Belgium, in 1998 and 2002, respectively. Gent, Belgium, in 1967 and 1970, respectively. From 1998 to 2005, her activities as a Research Since 1984, he has been a Lector, and since 1991, Assistant and Postdoctoral Fellow were supported by a Full Professor at Ghent University. In 1976, he the Research Foundation–Flanders. She worked as a founded the Fuzziness and Modelling visiting scholar in the BISC group at the University Research Unit (FUM) and since then his research of California, Berkeley, and the Knowledge Sys- has been focused on the modelling of fuzziness and tems Laboratory at Stanford University, Stanford, uncertainty, and has resulted in a great number of CA. Since 2005, she has been a Professor in the contributions in fuzzy set theory and its various Department of Applied Mathematics and Computer generalizations. The theories of fuzzy relational Science. Her current research efforts are directed towards the development calculus and of fuzzy mathematical structures owe a great deal to him. Over and the use of computational intelligent methods for next generation web the years he has also been a supervisor of 20 Ph.D.s on fuzzy set theory. His applications. current research interests include fuzzy and intuitionistic fuzzy relations, fuzzy topology, and fuzzy image processing. He has authored or coauthored eleven books and more than 350 papers appeared in international refereed journals and proceedings. Chris Cornelis received the Ph.D. in computer sci- Dr. Kerre is a referee for 50 international scientific journals, and also ence from Ghent University, Gent, Belgium, with a a member of the editorial board of international journals and conferences thesis on “Two-Sidedness in the Representation and on fuzzy set theory. He was an honorary chairman at various international Processing of Imprecise Information,” in 2004. conferences. Currently, he is a Postdoctoral Fellow funded by the Research Foundation–Flanders. His research interests include various models of imprecision (fuzzy rough sets, bilattices, interval-valued fuzzy sets, v-fuzzy sets), data mining, and approximate reasoning. He is currently focusing on their appli- cation to personalized information access and web intelligence.