Concept and Implementation of a Fuzzy Data Warehouse

Concept and Implementation of a Fuzzy Data Warehouse Thesis presented to the Faculty of Economics and Social Sciences at the University of Fribourg (Switzerland), in fulfillment of the requirements for the degree of Doctor of Economics and Social Sciences by Daniel Fasel from St. Ursen FR Accepted by the Faculty of Economics and Social Sciences on 30.05.2012 at the proposal of Prof. Dr. Andreas Meier (First Advisor) and Prof. Dr. Ulrich Ultes-Nitsche (Second Advisor) Fribourg, Switzerland 2012 The Faculty of Economics and Social Sciences at the University of Fribourg neither approves nor disapproves the opinions expressed in a doctoral dissertation. They are to be considered those of the author (Decision of the Faculty Council of 23 January 1990). ii For my daughter Leola Mai Ly iii Acknowledgment I would like to thank Prof. Dr. Andreas Meier. Without his support and encouragement I would never have started this thesis, much less I would have finished it. During the PhD study he always provided invaluable input for the thesis with his academic advices. His enthusiasm has been a great motivation during this time. I also wish to thank Prof. Dr. Ulrich Ultes-Nitsche for his constructive feedback. I would also like to express my deepest gratitude to Khurram Shahzad. Without his inputs, this thesis would never be in the shape as it is now. The time I spent for research with Khurram has been the most productive moments during my PhD study. In Khurram, I found a very very good friend and I am more than glad that I met Khurram through our common research interests. I would also like to express my gratitude to my research colleagues in the Information System Research Group and in the Department of Informatics of the University of Fri- bourg. Many thanks go to Kristen Curtis and my brother Thomas, who read the thesis and helped me a lot improving the language. Last, I would like to thank my wife Stéphanie and my daughter Leola. Without your patience and support I would never have had the energy to write this thesis. You are the essence in my life and the reason for writing this thesis, for doing what I am doing. With every single word in this thesis I would like to tell you: I love you! iv Contents Contents 1. Introduction 1 1.1. Motivation . 1 1.2. Research Methodology . 2 1.3. Chapter Overview . 4 1.4. Publications . 6 I. Concept 9 2. Fundamental Concepts 10 2.1. Data Warehouse Concepts . 10 2.1.1. Dimension . 12 2.1.2. Fact . 13 2.1.3. Summarizability . 15 2.1.4. Star and Snowflake Schema . 17 2.1.5. Classical Operations . 19 2.2. Concepts of Fuzzy Logic . 31 2.2.1. Fuzzy Set Theory . 32 2.2.2. Linguistic Concepts . 38 2.2.3. Application of Fuzzy Logic . 41 3. Fuzzy Data Warehouse 44 3.1. Existing Research . 44 3.1.1. Data Warehouse Approaches for Handling Imprecise Data . 45 3.1.2. Approaches for Implementing Fuzziness into Data Warehouse . 46 3.1.3. The Feng and Dillon Framework for Implementing Fuzziness into Data Warehouse . 48 3.1.4. Evaluation and Comparison of the Existing Approaches . 49 3.2. Fuzzy Data Warehouse Concept . 59 3.2.1. Basic Definitions and Fuzzy Meta Tables . 60 3.2.2. Fuzzy Data Warehouse Model . 61 3.2.3. Guidelines for Modeling the Fuzzy Data Warehouse . 62 3.2.4. The Fuzzy Data Warehouse Meta Model . 64 3.3. A Method for Modeling a Fuzzy Data Warehouse . 65 3.3.1. Defining Classification Elements . 66 3.3.2. Building Fuzzy Data Warehouse Model . 70 v Contents 3.4. Characteristics of Fuzzy Concepts in Fuzzy Data Warehouse . 73 3.4.1. Types of Fuzzy Concepts . 74 3.4.2. Aggregation and Propagation of Fuzzy Concepts . 81 3.4.3. Persistency of Target Attributes . 88 3.4.4. Metaschema for Fuzzy Concepts . 91 3.4.5. Calculation of Membership Function . 93 3.5. Operations in Fuzzy Data Warehouse . 100 3.5.1. Classical Data Warehouse Operations in Fuzzy Data Warehouse . 103 3.5.2. Fuzzifying and Defuzzifying Cubes . 111 3.5.3. Aggregations with Fuzzy Concept . 112 II. Application 119 4. Application of Fuzzy Data Warehouse 120 4.1. The Movie Rental Company . 120 4.2. Integration of Fuzzy Concepts in the Data Warehouse . 128 4.2.1. Dimension Movie . 128 4.2.2. Dimension Customer . 130 4.2.3. Dimension Employee . 132 4.2.4. Dimension Store . 133 4.2.5. Fact Revenue . 135 4.2.6. Fact User Rating . 138 4.2.7. Fuzzy Data Warehouse Schema . 140 4.3. Using the Fuzzy Data Warehouse . 141 III. Implementation 155 5. Implementation 156 5.1. Architectural Overview . 156 5.2. Database . 157 5.3. Business Logic . 163 5.3.1. XML Schema for Dimensions and Facts . 164 5.3.2. XML Schema for Fuzzy Concepts . 173 5.3.3. The Query and the Fuzzy Concept Administration Engine . 178 5.4. Visualization . 180 5.4.1. Navigation of the Fuzzy Data Warehouse . 180 5.4.2. Administration of Fuzzy Concepts . 184 vi Contents IV. Evaluation and Conclusion 187 6. Evaluation and Conclusion 188 6.1. Evaluation . 188 6.1.1. Concept . 188 6.1.2. Application and Implementation . 191 6.1.3. Further Outlook . 192 6.2. Conclusion . 193 A. The XML Schema of the Crisp Part 207 B. The XML Schema of the Fuzzy Part 211 C. The XML Document of the Crisp Fuzzy Data Warehouse Elements 214 D. The XML Document of the Fuzzy Concepts 225 vii List of Figures List of Figures 2.1. Architecture of Data Warehouse adapted from [CD97] . 11 2.2. Example Dimension “Purchase Order Structure” . 14 2.3. Disjointness Condition for Dimension Student Affiliation . 16 2.4. Completeness Condition in a Dimension “Course” . 17 2.5. Star Schema . 18 2.6. Snowflake Schema . 20 2.7. Possible Result Set of Query in Listing 2.1 . 22 2.8. Possible Result Set of Query in Listing 2.2 . 22 2.9. R of Basic Cube . 23 2.10. R of Cube . 24 2.11. Cube according Argrawal et al. [AGS97] . 25 2.12. Restriction Operation applied to a two dimensional Cube . 26 2.13. Example Data Warehouse Snowflake Schema . 27 2.14. Fuzzy Subsets of Linguistic Terms "very short", "short", "medium", "long" and "very long" . 40 2.15. Hierarchical Representation of a Linguistic Variable . 40 3.1. Structure of Chapter 3 . 45 3.2. Fuzzy Data Warehouse Meta Model . 64 3.3. A Graphical Overview of the Method for Modeling a Fuzzy Data Warehouse 66 3.4. Dimension Customer . 66 3.5. Dimension Customer with Fuzzy Concepts . 73 3.6. Schematic Example of an Open End Fuzzy Concept . 74 3.7. Customer and Employee Dimension . 76 3.8. Example of Open End Fuzzy Concept . 77 3.9. Schematic Representation of a Limited Fuzzy Concept . 77 3.10. Example of Limited Fuzzy Concept . 79 3.11. Schematic Representation of an Adaptive Fuzzy Concept . 80 3.12. Example of Adaptive Fuzzy Concept . 81 3.13. Dimension Store with Fuzzy Concept Store Surface . 82 3.14. Aggregation of a Fuzzy Concept . 83 3.15. Wrong Aggregation of Fuzzy Concept using Summation . 84 3.16. Wrong Aggregation of Fuzzy Concept using Average . 85 3.17. Propagation of Fuzzy Concept Store Surface . 86 3.18. Dimension Store and Fact Revenue with Fuzzy Concepts . 87 3.19. Propagation of a Fuzzy Concept . 88 viii List of Figures 3.20. Dimensions Time and Store, Fact Revenue and Fuzzy Concept Revenue . 90 3.21. Representation of Revenue per City and Time . 91 3.22. Membership Function for two Fuzzy Classes . 93 3.23. Example Fuzzy Data Warehouse Snowflake Schema including Fuzzy Meta Tables . 102 3.24. Multivalued Dimension with a Bridge Table . 113 3.25. Fuzzy Data Warehouse Schema with Dimension Customer, Movie, Time, with Facts User Rating and Revenue . 117 4.1. Snowflake Scheme of the Data Warehouse . 121 4.2. Fuzzy Concept Movie Genre . 129 4.3. Fuzzy Concepts Customer Revenue and Customer Age . 132 4.4. Fuzzy Concept Employee Age . 133 4.5. Fuzzy Concept Store Surface . 134 4.6. Propagated Fuzzy Concepts City Store Surface and Region Store Surface 136 4.7. Fuzzy Concept Revenue . 137 4.8. Propagated Fuzzy Concepts Revenue . 139 4.9. Fuzzy Concept User Rating . 140 4.10. Fuzzy Data Warehouse Schema for the Movie Rental Company . 141 5.1. Overview of Prototype Architecture . 158 5.2. Node Relation “dwh”, “cube”, “fact” and “dimensions” . 165 5.3. Node “fact” and its Children . 165 5.4. Node “dimensions” and its Children . 166 5.5. Node “hierarchy” and its Children . ..

Load more