
arXiv:1401.0494v1 [cs.DB] 2 Jan 2014

Flexible SQLf query based on fuzzy linguistic summaries

Ines Benali-Sougui, Minyar Sassi-Hidri, Amel Grissa-Touzi
Université Tunis El Manar
Ecole Nationale d'Ingénieurs de Tunis, Laboratoire Signal, Images et Technologies de l'Information, BP. 37, Le Belvédère 1002, Tunis, Tunisia
Faculté des Sciences de Tunis, Laboratoire d'Informatique, Programmation, Algorithmique et Heuristiques, Campus Universitaire, Tunis 1060, Tunisia
[email protected], {minyar.sassi, amel.touzi}@enit.rnu.tn

Abstract: Data is often partially known, vague or ambiguous in many real world applications. To deal with such imprecise information, fuzziness is introduced in the classical model. SQLf is one of the practical languages for flexible fuzzy querying in Fuzzy DataBases (FDB). However, with a huge amount of fuzzy data, the necessity to work with synthetic views became a challenge for many DB community researchers. The present work deals with flexible SQLf queries based on fuzzy linguistic summaries. We use the fuzzy summaries produced by our Fuzzy-SaintEtiq approach. It provides a description of objects depending on the fuzzy linguistic labels specified as selection criteria.

Keywords: Fuzzy relational databases, Flexible querying, summarization, SQLf, Top k query, Fuzzy FCA.

I. INTRODUCTION

In recent years, a lot of attention has been attracted to Fuzzy DataBases (FDB), which generalize the classical relational data model by allowing uncertain and imprecise information to be represented and manipulated. Data is often partially known, vague or ambiguous in many real world applications. Fuzziness is introduced in the classical model to deal with such imprecise information, and several extensions of the model are available in the literature.

We are more and more confronted with the situation in which applications need to manage fuzzy data and to profit from flexible querying [1], [2], [3].

In the last decades, a relational database language for fuzzy querying, called SQLf, has had great success for the description and the manipulation of FDB [4]. It extends SQL by allowing the user to construct queries on atomic conditions defined by fuzzy sets. Each atomic condition combines a satisfaction level µ ∈ [0, 1] with an attribute value.

In addition, SQLf limits the number of answers by using a quantitative calibration (the k best responses, or top-k query) or a qualitative calibration (the data that satisfy the query with a degree above a threshold α).

However, the massive volume of data reached today makes a better exploitation of these data necessary. Several solutions have been proposed to solve this problem and to contribute to summarization. Formal approaches are among those proposed to address this problem [5], [6], [7].

In [8], we have proposed a fuzzy linguistic summarization approach called Fuzzy-SaintEtiq, which uses the concept lattice, the core of Formal Concept Analysis (FCA) [9]. This approach is an extension of the SaintEtiQ model [10] to support fuzzy data. It consists of two major steps. The first, called the preprocessing step, considers a fuzzy clustering that generates fuzzy partitions, or clusters, associating the DB records with many clusters by means of membership degrees. This is a form of optimization: it minimizes the risks attached to the domain expert, compared with the linguistic summarization proposed for DB in [7]. The second step, called post processing, uses the fuzzy concept lattice as a conceptual hierarchy in order to generate the summaries.

Furthermore, querying summaries is crucial since it makes it possible to rapidly get a rough idea of the properties of the tuples in a given relation. Even though the interpretation of a single summary is simple, the task becomes difficult with a high number of summaries. We are therefore faced with the problem of their use.

In this work, we propose to exploit the hierarchical summaries of Fuzzy-SaintEtiq under flexible SQLf queries. The theory of fuzzy sets [11], used in the summarization process, allows flexible querying [12].

The rest of the paper is organized as follows: Section 2 presents the basic concepts of fuzzy FCA and the modeling of fuzzy queries. Section 3 presents an overview of our Fuzzy-SaintEtiq approach, whose theory is presented in [8]. Section 4 describes our new approach for flexible SQLf queries based on the Fuzzy-SaintEtiq approach. Section 5 concludes the paper and gives some future work.

II. PRELIMINARIES

A. Fuzzy FCA

In this section, we discuss the Fuzzy FCA proposed by [13], which incorporates fuzzy logic into FCA to represent vague information.

Definition 1. A fuzzy formal context is a triple K_f = (G, M, I = ϕ(G × M)) where G is a set of objects, M is a set of attributes, and I is a fuzzy set on domain G × M. Each relation (g, m) ∈ I has a membership value µ(g, m) in [0, 1].

A fuzzy formal context can also be represented as a cross-table, as shown in Table I. The context has three objects representing three documents, namely D1, D2 and D3. In addition, it has three attributes, Data Mining (D), Clustering (C) and Fuzzy Logic (F), representing three research topics. The relationship between an object and an attribute is represented by a membership value between 0 and 1. A confidence threshold T can be set to eliminate relations that have low membership values [13]. Table I shows the cross-table of the fuzzy formal context with T = 0.5.

[Figure 1 omitted]
Fig. 1. (a) A concept lattice generated from traditional FCA. (b) A fuzzy concept lattice generated from Fuzzy FCA.

The fuzzy concept lattice of Fig. 1 (b) is generated from the fuzzy formal context given in Table I.
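As a minimal sketch of Definition 1, of the confidence threshold T, and of the derivation operators A* and B* that Definition 2 builds on them, the fuzzy formal context can be held as a mapping from (object, attribute) pairs to membership values. The values at or above 0.5 are those listed in Table I; the three values below 0.5 are hypothetical, since the thresholded table does not give them. The names `apply_threshold`, `up`, `down` and `memberships` are illustrative assumptions, not part of the paper.

```python
# Fuzzy formal context K_f = (G, M, I): I maps (object, attribute) pairs
# to membership values in [0, 1].
G = ["D1", "D2", "D3"]   # objects (documents)
M = ["D", "C", "F"]      # attributes (research topics)

context = {
    # Memberships listed in Table I (all >= 0.5).
    ("D1", "D"): 0.80, ("D1", "F"): 0.61,
    ("D2", "D"): 0.90, ("D2", "C"): 0.85,
    ("D3", "F"): 0.87,
    # HYPOTHETICAL low memberships, invented for this sketch only.
    ("D1", "C"): 0.12, ("D2", "F"): 0.41, ("D3", "D"): 0.27,
}


def apply_threshold(ctx, T):
    """Eliminate the relations whose membership value is below T."""
    return {pair: mu for pair, mu in ctx.items() if mu >= T}


def up(objs, ctx, attributes):
    """A*: attributes related to every object of objs in the thresholded context."""
    return {m for m in attributes if all((g, m) in ctx for g in objs)}


def down(attrs, ctx, objects):
    """B*: objects related to every attribute of attrs in the thresholded context."""
    return {g for g in objects if all((g, m) in ctx for m in attrs)}


def memberships(objs, attrs, ctx):
    """mu_g = min over m in B of mu(g, m); 1.0 when B is empty."""
    return {g: min((ctx[(g, m)] for m in attrs), default=1.0) for g in objs}


kept = apply_threshold(context, T=0.5)
extent = down({"D"}, kept, G)    # objects having attribute D
intent = up(extent, kept, M)     # attributes shared by those objects
print(sorted(extent), sorted(intent), memberships(sorted(extent), intent, kept))
```

Here the pair (extent, intent) satisfies A* = B and B* = A, so it is a fuzzy concept in the sense of Definition 2, with each object carrying the minimum of its memberships over the intent.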

TABLE I
FUZZY FORMAL CONTEXT WITH T = 0.5

        D      C      F
D1     0.8     -     0.61
D2     0.9    0.85    -
D3      -      -     0.87

Each relationship between an object and an attribute is represented as a membership value in the fuzzy formal context, so the intersection of these membership values should be their minimum, according to fuzzy set theory [11].

Definition 2. Fuzzy formal concept: Given a fuzzy formal context K_f = (G, M, I = ϕ(G × M)) and a confidence threshold T, we define A* = {m ∈ M | ∀g ∈ A : µ(g, m) ≥ T} for A ⊆ G and B* = {g ∈ G | ∀m ∈ B : µ(g, m) ≥ T} for B ⊆ M.

A fuzzy formal concept (or fuzzy concept) of a fuzzy formal context K_f = (G, M, I = ϕ(G × M)) with a confidence threshold T is a pair (A_f = ϕ(A), B) where A ⊆ G, B ⊆ M, A* = B and B* = A. Each object g ∈ ϕ(A) has a membership µ_g defined as µ_g = min_{m ∈ B} µ(g, m), where µ(g, m) is the membership value between object g and attribute m, which is defined in I. Note that if B = {} then µ_g = 1 for every g.

Definition 3. Let (A1, B1) and (A2, B2) be two fuzzy concepts of a fuzzy formal context (G, M, I). (ϕ(A1), B1) is the sub-concept of (ϕ(A2), B2), denoted (ϕ(A1), B1) ≤ (ϕ(A2), B2), if and only if ϕ(A1) ⊆ ϕ(A2) (⇔ B2 ⊆ B1). Equivalently, (A2, B2) is the super-concept of (A1, B1).

Definition 4. A fuzzy concept lattice of a fuzzy formal context K with a confidence threshold T is the set F(K) of all fuzzy concepts of K for this threshold, together with the partial order ≤.

Definition 5. The similarity of a fuzzy formal concept K1 = (ϕ(A1), B1) and its sub-concept K2 = (ϕ(A2), B2) is defined as:

\[ E(K_1, K_2) = \frac{|\varphi(A_1) \cap \varphi(A_2)|}{|\varphi(A_1) \cup \varphi(A_2)|} \qquad (1) \]

Figure 1 (a) gives the traditional concept lattice generated from Table I, in which the crisp values Yes and No are used instead of membership values. Figure 1 (b) gives the fuzzy concept lattice.

B. Fuzzy queries' modeling

The SQLf [4] language extends the SQL language by allowing the user to construct queries on atomic conditions defined by fuzzy sets. Each atomic condition combines a satisfaction level µ ∈ [0, 1] with an attribute value. For all the attributes of a tuple, the semantics of the degrees are the same, which implies that all criteria are commensurable. An SQLf query has the following syntax:

    Select [distinct] [k | α | k,α] ⟨attribute⟩
    From ⟨strict relation⟩
    Where ⟨fuzzy condition⟩

where ⟨fuzzy condition⟩ can incorporate nested or partitioned query blocks. The parameters k and α of the Select clause limit the number of answers by using a quantitative calibration (the k best responses) or a qualitative calibration (the data that satisfy the query with a threshold greater than α).

Example. Consider the employee relation presented in Table II and the following query: find the employees who are about 40 years old and have a high income.

TABLE II
RELATION EMPLOYEE

Id   Name     Age   Income
1    Pierre   38    2900
2    Jean     37    2800
3    Yvette   42    2700

"About 40" and "high income" are the selection criteria, defined by fuzzy sets. In particular, for the tuples of the relation Employee we have used High income = {0.9/2900, 0.8/2800, 0.7/2700} and About 40 = {0.8/38, 0.6/37, 0.8/42}.

Each tuple is associated with a vector that represents its position regarding the atomic conditions. The final result is evaluated using a standard triangular norm (e.g. min) to express the conjunction between the criteria. We then obtain Pierre(0.8) > Yvette(0.7) > Jean(0.6), which means that tuple 1 is preferred to tuple 3, which is preferred to tuple 2.

With the SQLf language, preferences are considered only as constraints and are taken into account through commensurable fuzzy predicates, modeled using fuzzy sets of more or less satisfactory values. Consequently, the results are totally ordered. Such predicates can be combined using a rich set of fuzzy logic operators, some of which reflect effects such as compensation or relative importance between criteria and have no counterpart in Boolean logic. Unlike other approaches (such as Preference SQL), data selection and preference ordering are processed simultaneously.

Besides, top-k queries have attracted much interest in many different areas such as network and system monitoring [14], information retrieval [15], sensor networks [16], [17], large databases [18], multimedia databases [19], spatial data analysis [20], [21], P2P systems [22], data stream management systems [23], etc.

The main reason for such interest is that they avoid overwhelming the user with large numbers of uninteresting answers, which are resource-consuming.

The problem of answering top-k queries can be modeled as follows [24]. Suppose that we have m lists of n data items such that each data item has a local score in each list and the lists are sorted according to the local scores of their data items. Each data item has an overall score, which is computed from its local scores in all lists using a given scoring function. The problem is then to find the k data items whose overall scores are the highest.

III. FUZZY DATA SUMMARIZATION

A. Overview of Fuzzy-SaintEtiq approach

In [8], we have proposed a fuzzy linguistic summarization approach, Fuzzy-SaintEtiq, which is based on the FCA-based Summary model [25]. It takes the database records and provides knowledge. Figure 2 gives the system architecture.

[Figure 2 omitted]
Fig. 2. The overall process of Fuzzy-SaintEtiq

The summarization act is considered as a process of knowledge discovery from databases, in the sense that it is organized according to the two following principal steps. The preprocessing step organizes the FDB records into homogeneous clusters having common properties. This step gives a certain number of clusters for each attribute. Each tuple has values in the interval [0, 1] representing its membership degrees with respect to the formed clusters. Linguistic labels, which are fuzzy partitions, are assigned on each attribute's domain. For the classification of these fuzzy data, a new algorithm, called Fuzzy-FCM [26], has been proposed. It is an extension of the FCM algorithm that supports the different types of data represented by the GEFRED model [1]. The second step, called post processing, takes into account the result of the fuzzy clustering on each attribute and visualizes it using fuzzy concept lattices. Then, it nests them in a fuzzy nested lattice. Finally, it generalizes them in a fuzzy lattice associating all records in a simple and hierarchical structure. Each lattice node is a fuzzy concept which represents a concept summary. This structure defines summaries at various hierarchical levels.

For a more formal expression of a query, consider the following notation:

• A = {A_1, A_2, ..., A_k} is a set of attributes,
• A_k is the k-th attribute which appears in query Q,
• R(A) is the relation whose tuples are summarized,
• t_i is the i-th tuple, i ∈ 1...N,
• L_k = {l_1k, ..., l_jk} is the set of linguistic terms of attribute A_k,
• µ_ijk is the membership degree of tuple t_i (record) to the linguistic term l_jk of attribute A_k,
• R_Zf is the subset of tuples involved in summary z_f,
• I_Zf is the subset of linguistic terms which appear in summary z_f,
• Z_f = (R_Zf, I_Zf) is the concept summary; R_Zf and I_Zf are respectively the extension and the intension of the concept,
• level is the level of the concept summary in the concept lattice,
• |R_Zf| is the number of candidate tuples in R_Zf.

B. Illustrative example

Let us consider a relation R = (IdTuple, Age, Income, ProfessionalBackground) from an FDB Employees table. Table III gives a sample of the FDB Employees table.

TABLE III
DATA SAMPLE

IdTuple   Age          Income        ProfessionalBackground
t1        [38,39,40]   950           10
t2        Adult        650           5
t3        Young        700 ± 20      3
t4        Adult        Poor          20
t5        38           Poor          7
t6        40 ± 2       Comfortable   12

Each cluster of a partition is labeled by a linguistic descriptor provided by a domain expert. For example, the fuzzy label Young belongs to a partition built on the domain of the Age attribute. Linguistic variables are associated with the attributes of R; they constitute the new attribute domains used for rewriting the tuples in the summarization process.

For cluster generation, we carry out a fuzzy clustering [27], benefiting from fuzzy logic. This operation makes it possible to generate, for each attribute, a set of membership degrees. Several fuzzy clustering algorithms have been proposed in the literature [28], [29].

Table IV presents the results of fuzzy clustering applied to the Age and Income attributes. For the Age attribute, fuzzy clustering generates two clusters whereas, for the Income attribute, there are three clusters. The minimal (resp. maximal) value of each cluster corresponds to the lower (resp. upper) bound of its interval of values. Each cluster of a partition is labeled with a linguistic descriptor provided by the user or a domain expert.

Let us consider the example of Table III. A first query on it is Q1:

    Select 3 0.5 Income, ProfessionalBackground
    From Employees
    Where Age IN (Young);
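The min-based ranking of the Table II example and the two calibration parameters of Q1 (k = 3, α = 0.5) can be sketched as follows. This is only a tuple-level illustration of SQLf's conjunction and calibrations, not the summary-based evaluation of Section IV. The Young-age degrees are those that Table IV assigns to t1..t6 under the label YA; the function names `rank_conjunction` and `calibrate` are assumptions, the qualitative calibration is read as "degree ≥ α", and ties keep the input order.

```python
def rank_conjunction(*criteria):
    """Combine the degrees of all criteria with min (a triangular norm)
    and rank the items by decreasing overall degree."""
    names = criteria[0].keys()
    degrees = {name: min(c[name] for c in criteria) for name in names}
    # sorted() is stable, so items with equal degrees keep their input order.
    return sorted(degrees.items(), key=lambda item: item[1], reverse=True)


def calibrate(ranked, k=None, alpha=None):
    """Apply the qualitative calibration (degree >= alpha) and then the
    quantitative calibration (k best answers) to a ranked list."""
    if alpha is not None:
        ranked = [(name, mu) for name, mu in ranked if mu >= alpha]
    if k is not None:
        ranked = ranked[:k]
    return ranked


# Table II example: "about 40 years old and a high income".
high_income = {"Pierre": 0.9, "Jean": 0.8, "Yvette": 0.7}
about_40 = {"Pierre": 0.8, "Jean": 0.6, "Yvette": 0.8}
ranking = rank_conjunction(about_40, high_income)
print(ranking)

# Q1: Select 3 0.5 ... Where Age IN (Young), on the YA degrees of Table IV.
young_age = {"t1": 0.5, "t2": 0.4, "t3": 0.7, "t4": 0.2, "t5": 0.6, "t6": 0.5}
q1_answers = calibrate(rank_conjunction(young_age), k=3, alpha=0.5)
print(q1_answers)
```

With α alone, four tuples pass the threshold (t1, t3, t5, t6); adding k = 3 keeps the three best of them.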
For the cluster labels, the following abbreviations are used:

• For the Age attribute: YA (Young Age) and AA (Adult Age).
• For the Income attribute: PI (Poor Income), MI (Modest Income) and CI (Comfortable Income).
• For the Professional background attribute: A (Associate), E (Expert) and S (Senior).

Table IV gives the result of fuzzy clustering from the data in Table III. The membership degree of a tuple to each label is given in parentheses.

TABLE IV
DATA CLUSTERING

IdTuple   Age                  Income               ProfessionalBackground
t1        YA(0.5), AA(0.5)     MI(0.6), CI(0.4)     A(0.7), E(0.3)
t2        YA(0.4), AA(0.6)     PI(0.4), MI(0.6)     A(0.5), E(0.5)
t3        YA(0.7), AA(0.3)     MI(0.8)              A(0.8)
t4        YA(0.2), AA(0.8)     PI(0.6), MI(0.4)     E(0.3), S(0.7)
t5        YA(0.6), AA(0.4)     PI(0.7)              A(0.7), E(0.3)
t6        YA(0.5), AA(0.5)     CI(0.8)              E(0.4), S(0.6)

IV. AN SQLF-BASED FLEXIBLE SUMMARY QUERYING

A flexible querying process of a DB can be divided into three steps [30]: extension of criteria, selection of results and ordering. The first step uses similarities between values to extend the criteria. It allows graded semantics for any criterion, which can now express "around 20" instead of being limited to the binary semantics of "equal to 20" or "between 18 and 22". The second step, namely the selection of results, determines which data will participate in the answer to the query. The last step (ordering) follows from the extension of criteria. It discriminates the results on the basis of their relative satisfaction of the graded semantics: a value of 20 is better ranked than a value of 18.

Works on flexible querying in FDB exemplify the use of fuzzy sets. They are essentially characterized by tuple-oriented processing, the possibility of defining new terms and, especially, the use of satisfaction degrees to extract the top-k query.

A. Fuzzy query expression

An SQLf query has two parameters, α and k. As said previously, the parameters k and α of the Select clause limit the number of answers by using a quantitative calibration (the k best responses) or a qualitative calibration (the data that satisfy the query with a threshold greater than α).

The parameter k is given by the user, and the parameter α is calculated from the number of clusters involved in the condition of the Select query:

\[ \alpha = \frac{1}{\max(NClus)} \qquad (2) \]

where NClus is the number of clusters involved in the condition of the clause. For instance, a condition on Age (two clusters) and Income (three clusters) gives α = 1/3 ≈ 0.3, while a condition on Age alone gives α = 1/2 = 0.5.

Returning to the example of Table III, a second query is Q2:

    Select 3 0.3 ProfessionalBackground
    From Employees
    Where Income IN (Comfortable) AND Age IN (Young);

In a query, descriptors such as Young and Comfortable in Q1 and Q2 are called required characteristics and embody the properties that a record must have to be considered an element of the answer. A query also defines the attributes for which required characteristics exist. The set of these input attributes is denoted by Inputs(A_Q). The expected answer is a description over a set of other attributes, denoted by Outputs(A_Q). It is the complement of Inputs(A_Q) relative to A_Q (the set of attributes appearing in the query Q):

\[ Inputs(A_Q) \cup Outputs(A_Q) = A_Q \qquad (3) \]

and

\[ Inputs(A_Q) \cap Outputs(A_Q) = \emptyset \qquad (4) \]

Hence a query Q defines not only a set Inputs(A_Q) of input attributes but also, for each attribute A_k, the set L_Ak(Q) of its required characteristics, that is, the linguistic terms of attribute A_k that appear in query Q. The set of sets L_Ak(Q) is denoted by L(Q).

For example, for Q2, these sets are determined as follows:

• Inputs(A_Q2) = {Income, Age};
• Outputs(A_Q2) = {ProfessionalBackground};
• L_Income(Q2) = {CI}, L_Age(Q2) = {YA};
• L(Q2) = {L_Income(Q2), L_Age(Q2)};
• The degree of membership to this query is 0.3 and the number of desired results is 3.

B. Fuzzy query

The query is rewritten as a logical proposition P_f(Z_f, Q) used to qualify the link between the fuzzy summary Z_f and the query Q. P_f(Z_f, Q) is in conjunctive form, in which all descriptors are literals. Each set of descriptors then yields one corresponding clause. Thereafter, we apply an α-cut on this new form, with the parameter α calculated previously. This form is defined as follows:

\[
\begin{aligned}
&\min\big((l_{11} \vee l_{21} \vee \dots \vee l_{j1})(x),\ (l_{12} \vee l_{22} \vee \dots \vee l_{j2})(x),\ \dots,\ (l_{1k} \vee l_{2k} \vee \dots \vee l_{jk})(x)\big) \ge \alpha \\
&\iff (l_{11} \vee l_{21} \vee \dots \vee l_{j1})(x) \ge \alpha,\ (l_{12} \vee l_{22} \vee \dots \vee l_{j2})(x) \ge \alpha,\ \dots,\ (l_{1k} \vee l_{2k} \vee \dots \vee l_{jk})(x) \ge \alpha \\
&\iff \big(l_{11}(x) \ge \alpha \text{ or } l_{21}(x) \ge \alpha \text{ or } \dots \text{ or } l_{j1}(x) \ge \alpha\big),\ \big(l_{12}(x) \ge \alpha \text{ or } \dots \text{ or } l_{j2}(x) \ge \alpha\big),\ \dots,\ \big(l_{1k}(x) \ge \alpha \text{ or } \dots \text{ or } l_{jk}(x) \ge \alpha\big) \\
&\implies P(Q) = \big(\alpha\text{-cut}(l_{11}) \vee \alpha\text{-cut}(l_{21}) \vee \dots \vee \alpha\text{-cut}(l_{j1})\big) \wedge \big(\alpha\text{-cut}(l_{12}) \vee \alpha\text{-cut}(l_{22}) \vee \dots \vee \alpha\text{-cut}(l_{j2})\big) \wedge \dots \wedge \big(\alpha\text{-cut}(l_{1k}) \vee \alpha\text{-cut}(l_{2k}) \vee \dots \vee \alpha\text{-cut}(l_{jk})\big)
\end{aligned}
\]

Example: Consider the query Q3:

    Select 3 0.3 ProfessionalBackground
    From Employees
    Where Age IN (Young, Adult)
    And Income IN (Poor, Modest);

We then have:

• Inputs(A_Q3) = {Age, Income};
• L_Age = {YA, AA};
• L_Income = {PI, MI};
• The degree of membership to this query is 0.3 and the number of desired results is 3;
• P(Q3) = (0.3-cut(YA) ∨ 0.3-cut(AA)) ∧ (0.3-cut(PI) ∨ 0.3-cut(MI)).

Let v_f be the valuation function. The valuation of P_f(Q) obviously depends on the summary Z_f. Thus v_f(P_f(Q)_Zf) denotes the valuation of P_f(Q) in the context of Z_f, and L_Ai(Z_f) denotes the set of linguistic terms that appear in Z_f. We can distinguish three cases:

• Coresp(Z_f, Q) = Exact: v_f(P_f(Q)_Zf) = true and L_Ai(Z_f) ⊆ L(Q): all tuples included in Z_f verify the query Q;
• Coresp(Z_f, Q) = False: v_f(P_f(Q)_Zf) = false and L_Ai(Z_f) ≠ L(Q): the linguistic terms that appear in Z_f do not correspond to the terms in query Q;
• Coresp(Z_f, Q) = Indecision: ∃ i, L_Ai(Z_f) − L(Q) ≠ ∅: there are some tuples in Z_f satisfying Q.

These situations give a global view of the matching of a fuzzy summary Z_f with a query Q.

C. Fuzzy k-query

The idea of the Fuzzy k-query algorithm is to search the concept summaries for those that respond to the query and to calculate their satisfaction degrees. We then sort them in a list by satisfaction degree. These steps are repeated until no branch of the summary hierarchy that satisfies the query remains. Finally, we can display the top k α-summaries from the list of results. Figure 3 shows the principle of our approach.

[Figure 3 omitted]
Fig. 3. An SQLf-based flexible summary querying approach step

Definition 6. An α-summary, denoted α-Z_f, is a fuzzy summary Z_f = (R_Zf, I_Zf) in which R_Zf is a collection of candidate records R_Zf = {t_1, t_2, ..., t_N}, which represents the extent, and I_Zf is the intent. Each tuple t_i of R_Zf existing in the α-summary has a membership value µ(t_i) ≥ α. It can be formulated as follows:

\[ \alpha\text{-}Z_f = \{\, t_i \in Z_f \mid \mu(t_i) \ge \alpha \,\}, \quad \text{with } \mu(t_i) \in [0, 1] \qquad (5) \]

D. Query evaluation

In this section, we evaluate the proposed approach. The searching procedure should take into account all fuzzy summaries in the concept lattice that correspond to the query Q. The evaluation procedure is based on a generalization search and relies on the property of the concept lattice hierarchy. Algorithm 1 describes the different steps of the Fuzzy k-query function, where k is the number of answers and Zresult is the list of all α-summaries responding to a query Q.

Algorithm 1 Fuzzy k-query(Zf, k, α, Q)
Require: Zf the fuzzy concept summary, k the number of answers, α the threshold that the data must reach to satisfy the query Q.
Ensure: Result, the list of the top k α-summaries responding to the query Q.
    Result ⇐ ∅    {list of summaries with their satisfaction degrees}
    Zresult ⇐ PertinentResult(Zf, 0, α, Q)
    Zresult.first()
    for i = 1 → k do
        Result.info() ⇐ Zresult.info()
        Zresult.next()
        Result.next()
    end for

Algorithm 2 describes the different steps of the PertinentResult procedure.

Algorithm 2 PertinentResult(Zf, level, α, Q)
Require: Zf the fuzzy concept summary, level the current level of the summary, α the threshold that the data must reach to satisfy the query Q.
Ensure: Zresult, the list of all α-summaries responding to the query Q, sorted by decreasing satisfaction degree.
    Zresult ⇐ ∅
    if Coresp(Zf, Q) = Exact then
        x.degree ⇐ Calcul-SD(Zf, level)
        x.sum ⇐ α-summary(Zf, α)
        InsertSorted(Zresult, x)
    else if Coresp(Zf, Q) = Indecision then
        for all fuzzy summaries zf that are children of Zf do
            for all y in PertinentResult(zf, level + 1, α, Q) do
                InsertSorted(Zresult, y)
            end for
        end for
    end if
    return Zresult

InsertSorted(Zresult, x) inserts x before the first element of Zresult whose degree is lower than x.degree, or at the end of the list if there is no such element.

Calcul-SD is the function that calculates the satisfaction degree SD of the fuzzy summary Z_f. This degree is the maximum over the paths that lead to Z_f. It is determined as follows:

\[ SD = \max\Big(\sum_{j} \text{Fuzzy-score}(Z_j, Z_{j+1})\Big) \qquad (6) \]

with j = 1..p, p the current level of Z_f, and

\[ \text{Fuzzy-score}(Z_p, Z_{p+1}) = \frac{|Z_p \cap Z_{p+1}|}{|Z_p \cup Z_{p+1}|} \qquad (7) \]

Coresp(Z, Q) is the function that tests the correspondence between the summary Z_f and the query Q, as seen previously.

α-summary(Z, α) is the function that applies the definition of the α-summary; its result is α-Z = {t ∈ Z | µ(t) ≥ α}.

The result of applying the algorithm on the concept lattice for queries Q1, Q2 and Q3 is given in Table V.

TABLE V
TOP k α-SUMMARY

Query   α-Summary
Q1      α-z13 = {t1(0.5), t3(0.7), t5(0.6), t6(0.5)}
Q2      α-z42 = {t1(0.4)}, α-z34 = {t1(0.4), t6(0.8)}
Q3      α-z21 = {t1(0.5), t2(0.6), t4(0.4)}, α-z22 = {t1(0.5), t2(0.4), t3(0.7)}, α-z23 = {t2(0.4), t4(0.6), t5(0.7)}

V. CONCLUSION

In this paper we have presented a flexible SQLf query based on fuzzy linguistic summaries. Because the SQLf query limits the number of answers by using a quantitative or a qualitative calibration, the Fuzzy k-query mechanism explores a hierarchy of fuzzy summaries. Each fuzzy summary, represented by a fuzzy concept, is compared with set-based query descriptors from a predefined vocabulary, and an α-cut is applied. The result of this comparison determines whether the summary will be part of the answer, and it also conditions the exploration of a part of the hierarchy. The satisfaction degree calculated for each result then allows us to return the top k results.

In brief, our objective is to find a result even in the absence of summaries corresponding strictly to the query. To this end, we will introduce a new kind of repaired query. Repair is based on the idea that there could be a result semantically close to the query. To find these results, the query is modified using the best fuzzy summaries, that is, those nearest to answering the query.

REFERENCES

[1] J. Galindo, A. Urrutia and M. Piattini, Fuzzy databases: modeling, design and implementation, Hershey, USA: Idea Group Publishing, 2006.
[2] M.A. Ben Hassine, A. Grissa Touzi, J. Galindo and H. Ounelli, How to Achieve Fuzzy Relational Databases, Handbook of Research on Fuzzy Information Processing in Databases, Information Science Reference, pp. 351-380, 2008.
[3] P. Bosc, L. Liétard and O. Pivert, Bases de données et Flexibilité: Les requêtes Graduelles, Techniques et Sciences informatiques, vol. 7(3), pp. 355-378, 1998.
[4] P. Bosc and O. Pivert, SQLf: a language for fuzzy querying, IEEE Transactions on Fuzzy Systems, vol. 3, pp. 1-17, 1995.
[5] P. Bosc, D. Dubois, O. Pivert and H. Prade, Résumés de données et ensembles flous - principes d'une nouvelle approche, In LFA, pp. 333-340, 2000.
[6] D.H. Lee and M.H. Kim, Database summarization using fuzzy ISA hierarchies, IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics, vol. 27, pp. 68-78, 1997.
[7] G. Raschia, SaintEtiq: Une approche floue pour la génération de résumés à partir de bases de données relationnelles, Thèse de doctorat, Université de Nantes, 2001.
[8] I. BenAli-Sougui, M. Sassi-Hidri and A. Grissa-Touzi, About Summarization in Large Fuzzy Databases, The Fifth International Conference on Advances in Databases, Knowledge, and Data Applications, pp. 87-94, 2013.
[9] R. Wille, Restructuring lattice theory: An approach based on hierarchies of concepts, In Ivan Rival, editor, Ordered sets, Reidel, Dordrecht-Boston, pp. 445-470, 1982.
[10] G. Raschia and N. Mouaddib, SaintEtiQ: A fuzzy set-based approach to database summarization, Fuzzy Sets and Systems, vol. 129, no. 2, pp. 137-162, 2002.
[11] L.A. Zadeh, Fuzzy Sets, Journal of Information and Control, vol. 8, pp. 338-353, 1965.
[12] W.A. Voglozin, G. Raschia, L. Ughetto and N. Mouaddib, Querying the SaintEtiq summaries - a first attempt, FQAS, 2004.
[13] T.T. Quan, S.C. Hui and T.H. Cao, A Fuzzy FCA-based Approach to Conceptual Clustering for Automatic Generation of Concept Hierarchy on Uncertainty Data, Concept Lattices and Their Applications Workshop, pp. 1-12, 2004.
[14] B. Babcock and C. Olston, Distributed top-k monitoring, SIGMOD Conference, 2003.
[15] B. Kimelfeld and Y. Sagiv, Finding and approximating top-k answers in keyword proximity search, PODS Conference, 2006.
[16] A. Silberstein, R. Braynard, C.S. Ellis and K. Munagala, A sampling-based approach to optimizing top-k queries in sensor networks, IEEE International Conference on Data Engineering, 2006.
[17] M. Wu, J. Xu, X. Tang and W.-C. Lee, Monitoring top-k query in wireless sensor networks, IEEE International Conference on Data Engineering, 2006.
[18] A. Grissa-Touzi and H. Ounalli, A New Approach For Top-k Flexible Queries In Large Database Using The Knowledge Discovered, The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications, pp. 103-111, 2012.
[19] S. Chaudhuri, L. Gravano and A. Marian, Optimizing top-k selection queries over multimedia repositories, IEEE Transactions on Knowledge and Data Engineering 16(8), 2004.
[20] P. Ciaccia and M. Patella, Searching in metric spaces with user-defined and approximate distances, ACM Transactions on Database Systems, vol. 27(4), 2002.
[21] G.R. Hjaltason and H. Samet, Index-driven similarity search in metric spaces, ACM Transactions on Database Systems, vol. 28(4), 2003.
[22] R. Akbarinia, E. Pacitti and P. Valduriez, Processing top-k queries in distributed hash tables, Euro-Par Conf., 2007.
[23] K. Mouratidis, S. Bakiras and D. Papadias, Continuous monitoring of top-k queries over sliding windows, SIGMOD Conference, 2006.
[24] R. Fagin, J. Lotem and M. Naor, Optimal aggregation algorithms for middleware, J. of Computer and System Sciences 66(4), 2003.
[25] M. Sassi, A. Grissa-Touzi, H. Ounelli and I. Aissa, About Database Summarization, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 18(2), pp. 133-151, 2010.
[26] A. Grissa-Touzi, An Alternative Extension of the FCM Algorithm for Clustering Fuzzy Databases, The Second International Conference on Advances in Databases, Knowledge, and Data Applications, pp. 135-142, 2010.
[27] D. Vanisri and C. Loganathan, Survey on Fuzzy Clustering and Rule Mining, International Journal of Computer Science and Information Security 8(4), pp. 183-187, 2010.
[28] M. Sassi, Towards Fuzzy-Hard Clustering Mapping Processes, Advances in Fuzzy Sets and Systems 9(1), pp. 37-63, 2011.
[29] R. Pradeep and S. Shubha, A Survey of Clustering Techniques, International Journal of Computer Applications 7(12), pp. 1-5, 2010.
[30] H.L. Larsen, An approach to flexible information access systems using soft computing, 32nd Annual Hawaii International Conference on System Sciences, 6042, 1999.
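As a worked illustration of Section IV, the sketch below evaluates the proposition P(Q3) on the Age and Income membership degrees of Table IV, applies the α-summary filter of Definition 6, and computes the fuzzy score of Eq. (7) on two tuple sets. It is a sketch under stated assumptions, not the authors' implementation: a label that a tuple does not carry is given degree 0, the function names are hypothetical, and the two tuple sets passed to `fuzzy_score` are taken from Table V purely to exercise the formula, without claiming that they sit on consecutive lattice levels.

```python
# Age and Income membership degrees of t1..t6, as read from Table IV.
table_iv = {
    "t1": {"YA": 0.5, "AA": 0.5, "MI": 0.6, "CI": 0.4},
    "t2": {"YA": 0.4, "AA": 0.6, "PI": 0.4, "MI": 0.6},
    "t3": {"YA": 0.7, "AA": 0.3, "MI": 0.8},
    "t4": {"YA": 0.2, "AA": 0.8, "PI": 0.6, "MI": 0.4},
    "t5": {"YA": 0.6, "AA": 0.4, "PI": 0.7},
    "t6": {"YA": 0.5, "AA": 0.5, "CI": 0.8},
}

# Q3: Age IN (Young, Adult) And Income IN (Poor, Modest); one clause per attribute.
q3_clauses = [("YA", "AA"), ("PI", "MI")]


def query_degree(degrees, clauses):
    """min over clauses of the max over the terms of each clause."""
    return min(max(degrees.get(term, 0.0) for term in terms) for terms in clauses)


def satisfies(degrees, clauses, alpha):
    """P(Q): every clause has at least one term in its alpha-cut."""
    return all(any(degrees.get(term, 0.0) >= alpha for term in terms)
               for terms in clauses)


def alpha_summary(memberships, alpha):
    """Definition 6: keep the tuples whose membership is at least alpha."""
    return {t: mu for t, mu in memberships.items() if mu >= alpha}


def fuzzy_score(z_p, z_next):
    """Eq. (7): |Z_p ∩ Z_{p+1}| / |Z_p ∪ Z_{p+1}| on sets of tuple ids."""
    z_p, z_next = set(z_p), set(z_next)
    union = z_p | z_next
    return len(z_p & z_next) / len(union) if union else 0.0


alpha = 0.3
answers = [t for t, d in table_iv.items() if satisfies(d, q3_clauses, alpha)]
degrees = {t: query_degree(d, q3_clauses) for t, d in table_iv.items()}
print(answers)
print(alpha_summary(degrees, alpha))
print(fuzzy_score({"t1", "t2", "t4"}, {"t1", "t2", "t3"}))
```

The two tests agree on every tuple, as the chain of equivalences in Section IV-B states: the min-of-max degree reaches α exactly when each clause has a term in its α-cut. Only t6 is rejected, because it carries neither the PI nor the MI label.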