A Pattern Model and Algebra for Representing and Querying Relative Information Completeness

Total Page:16

File Type:pdf, Size:1020Kb

A Pattern Model and Algebra for Representing and Querying Relative Information Completeness Sorbonne Université Laboratoire Informatique Paris 6 THÈSE DE DOCTORAT DISCIPLINE: INFORMATIQUE A Pattern Model and Algebra for Representing and Querying Relative Information Completeness par Fatma-Zohra HANNOU Rapporteurs: Nicole BIDOIT-TOLLU Professeur, Université Paris-Sud Dimitris KOTZINOS Professeur, Université Cergy Pontoise Examinateurs: Ladjel BELLATRECHE Professeur, ENSMA, Poitiers Laure BERTI-EQUILLE Professeur, Université Aix-Marseille Christophe MARSALA Professeur, Sorbonne Université Directeur de thèse: Bernd AMANN Professeur, Sorbonne Université Encadrant de thèse: Mohamed-Amine BAAZIZI MdC, Sorbonne Université ”Have no fear of perfection, you will never reach it.” Marie Skłodowska-Curie Nobel Prizes in Physics and Chemistry Abstract Information incompleteness is a major data quality issue which is amplified by the increasing amount of data collected from unreliable sources. Assessing the completeness of data is crucial for determining the quality of the data itself, but also for verifying the validity of query answers over incomplete data. While there exists an important amount of work on modeling data completeness, deriving this completeness information has not received much attention. In this work, we tackle the issue of extracting and reasoning about complete and missing information under relative information completeness setting. Under this setting, the completeness of a dataset is assessed with respect to a complete reference dataset. Few works have been dedicated to representing data completeness under this setting, and we advance the field by proposing two contributions: a pattern model for providing minimal covers summarizing the extent of complete and missing data partitions and a pattern algebra for deriving minimal pattern covers for query answers to analyze their validity. The completeness pattern framework presents an intriguing opportunity to achieve many applications, particularly those aiming at improving the quality of tasks impacted by missing data. In our work, we address the problem of repairing query results obtained from incomplete data. Data imputation is a well-known technique for repairing missing data values but can incur a prohibitive cost when applied to large data sets. Query-driven imputation offers a better alternative as it allows for fixing only the data that is relevant for a query. We adopt a rule-based query rewriting technique for imputing the answers of analytic queries that are missing or suffer from incorrectness due to data incompleteness. We present a novel query rewriting mechanism that is guided by the completeness pattern model and algebra. Our solution strives to infer the broadest possible set of missing answers while improving the precision of incorrect ones. In the last contribution, we investigate the generalization of our pattern model for summarizing any data fragments. The generalized pattern model can be used to produce pattern summaries of data fragments over any subset of attributes and these summaries can be queried to analyze and compare data fragments in a synthetic and flexible way. Keywords: Relative Information, Completeness Assessment, Pattern model, Pattern Algebra, Imputation, Summarization v Résumé L’incomplétude des données est un problème majeur de qualité qui s’amplifie par la quantité croissante de données collectées par des sources peu fiables. L’évaluation de l’exhaustivité des données est cruciale pour déterminer leur qualité mais aussi la validité des réponses de requêtes qui en découlent. Dans le contexte de l’information relative, la complétude d’une base de données est évaluée en comparaison à une base référence. Nous apportons deux principales contributions à ce domaine: un modèle de motifs produisant des couvertures minimales résumant l’étendue des partitions de données complètes et manquantes, ainsi qu’une algèbre de motifs permettant de dériver des couvertures minimales pour l’analyse de la validité des réponses des requêtes. Ce modèle de motifs offre une opportunité intéressante pour réaliser de nombreuses applications, en particulier celles visant à améliorer la qualité des tâches affectées par les données manquantes. Nous adoptons une technique de réécriture de requêtes à base de règles pour imputer les réponses des requêtes d’agrégation manquantes ou présentant des valeurs incorrectes. Nous étudions également la généralisation de notre modèle de motifs pour effectuer la synthèse des fragments de données. Les résumés peuvent être interrogés pour analyser et comparer les fragments de données de manière synthétique et flexible. Mots Clés: Information Relative, Complétude de données, Modèle de motifs, Algèbre de motifs, Imputation, Synthétisation vii Dédicaces À Nour-Filastine, Pour toutes les fois où je lisais en te berçant, Où je réfléchissais en te nourrissant, Où j’écrivais en te parlant, Pour toutes les fois où cette thèse a partagé nos petits moments, Pour toutes les fois où je t’ai demandé de m’aider sans que tu comprennes, Où sur ce document tu as tenté d’écrire sans que tu y parviennes. Après dieu, qui remercier si ce n’est toi, tu m’as appris le sens de la force, l’espoir, la détermination, l’amour... Peu de gens peuvent comprendre, mais tu m’as fait sourire lors de mes échecs, et pleurer de bonheur avec ton seul sourire... Les matins parisiens glacials, le son de ta poussette, tes larmes la nuit, la fièvre, les dents, tes pas, tes mots... Je pourrais écrire un livre juste pour te dire merci, pour tout ce que tu m’as apporté, mais peu de mots ressembleront à ce que je ressens. Ma chère fille, Depuis que tu es née, chaque jour, c’est pour toi que je m’interdisais de lâcher, Pour toi que je refusais d’abandonner, C’est pour toi ma chérie que maman a continuée, Pour qu’un jour tu saches que dans la vie, on ne laisse jamais tomber, Dans la vie il faut sans cesse repousser ses limites, Aujourd’hui, toutes les deux, nous l’avons fait.... Comme pour mes jours et mes nuits... je te dédie cette thèse Maman ix Remerciements J’exprime mon profond respect et ma parfaite gratitude envers mon directeur de thèse, Bernd Amann. D’abord, pour la qualité de ses conseils et directives scientifiques, ainsi que sa disponibilité malgré toutes ses responsabilités. Bernd n’a jamais économisé son temps pour me prodiguer ses précieux conseils, et débattre de mes idées. Au-delà de son expertise, je reste impressionnée par sa qualité d’écoute et sa capacité à mettre ses étudiants en confiance pour défendre leurs propositions. Je le remercie pour ses encouragements et sa confiance en moi, pendant toutes les périodes parfois difficiles de ma thèse. Bernd a toujours su me motiver, me pousser au-delà de mes limites, me rassurer quand la science se compliquait, et partager ma joie quand elle me souriait. Qu’il trouve ici l’expression de ma profonde reconnaissance. Je remercie également mon encadrant de thèse, Mohamed-Amine Baazizi, pour ses conseils scientifiques, ses relectures soignées, et sa participation aux travaux de recherche. Je remercie les professeurs Nicole Bidoit-Tollu et Dimitris-Kotzinos d’avoir accepté d’être les rapporteurs de mon manuscrit, et pour la qualité de leurs remarques malgré des délais serrés. Je remercie également tous les membres du Jury, pour m’avoir fait l’honneur d’assister à ma soutenance afin d’évaluer mon travail et me faire profiter de leur grande expertise. Je remercie l’équipe du projet Ebita, qui a permis de financer cette thèse, et ses membres qui ont participé à l’amorce de cette thèse, à enrichir le débat sur les problématiques scientifiques que j’ai traité. J’ai eu la chance d’avoir passé mes années de thèses au sein de l’équipe Base de Données du laboratoire LIP6, dans un environnement propice à l’apprentissage. J’ai beaucoup appris en côtoyant les talentueux chercheurs de mon équipe que je remercie pour leur accueil et convivialité. Je garderai de ma thèse les agréables souvenirs de nos riches discussions sur la science et parfois la vie, avec ou sans caféine... Merci à Anne, Camélia, Hubert, Li Ke et Stéphane. xi Quoi que j’ai pu accomplir dans ma modeste vie, j’ai trouvé à mes côtés une famille aimante et supportrice qui m’a toujours donné l’envie de persévérer. Je salue d’abord la mémoire de ma grande-mère qui m’a inculqué les plus belles valeurs, et donné les plus beaux exemples de courage. J’aurai aimé te dire que j’ai franchis une nouvelle étape, que dieu t’accueille dans son vaste paradis. Papa À mon héros, à celui qui m’a appris ce qu’aucune école ne m’a apprise, ni aucun livre ne m’a transmis, Merci pour tes sacrifices, ton amour, ta sagesse et ton soutien, Merci parce que tu m’as appris depuis mes premiers pas, comment grimper les plus hautes mon- tagnes et viser toujours plus loin que le ciel... Maman À ma fidèle supportrice et ma meilleure amie, merci pour ton amour inconditionnel et ton soutien indéfectible. Je sens tes prières avec chaque pas je mets en avant et tu me rassures plus que tout le monde dans mes moments de doutes. Je resterai à jamais ta fille qui ne grandira qu’aux yeux des autres... Ilhem, Meriem, Hadjer, Mouloud, Yasmine Même séparés de centaines de kilomètres, votre joie de vivre, votre énergie débordante et votre amour m’éclairent tous les jours, C’est vous que j’appelle toujours avant mes épreuves, et vous que j’appelle d’abord pour partager mes réussites, Peu importe où la vie nous mènent, je porterai nos promesses enfantines comme une étoile au cou, Ma fierté est sans égale... Mohammed Merci d’être mon pilier et mon repère, Pour tout ce qu’on a partagé ensemble, et ce qu’il nous reste à bâtir, Pour ton soutien, ta confiance, ton enthousiasme à l’égard de mes travaux, Merci pour la fierté sincère qui brille dans ton regard, Ensemble on ira aussi loin que nos rêves les plus fous..
Recommended publications
  • PML, an Object Oriented Process Modelling Language
    PML, an Object Oriented Process Modeling Language Prof. Dr.-Ing. Reiner Anderl 1, and Dipl.-Ing. Jochen Raßler 2 1 Prof. Dr.-Ing. Reiner Anderl, Germany, [email protected] 2 Dipl.-Ing. Jochen Raßler, Germany, [email protected] Abstract: Processes are very important for the success within many business fields. They define the proper application of methods, technologies, tools and company structures in order to reach business goals. Important processes to be defined are manufacturing processes or product development processes for example to guarantee the company’s success. Over the last decades many process modeling languages have been developed to cover the needs of process modeling. Those modeling languages have several limitations, mainly they are still procedural and didn’t follow the paradigm change to object oriented modeling and thus often lead to process models, which are difficult to maintain. In previous papers we have introduced PML, Process Modeling Language, and shown it’s usage in process modeling. PML is derived from UML and hence fully object oriented and uses modern modeling techniques. It is based on process class diagrams that describe methods and resources for process modeling. In this paper the modeling language is described in more detail and new language elements will be introduced to develop the language to a generic usable process modeling language. Keywords: process modeling language, PML, UML 1. Introduction As the tendency of enterprises to collaborate growths steadily, industry faces new challenges managing business processes, product development processes, manufacturing processes and much more. Furthermore, discipline spanning product development processes are increasing, e.
    [Show full text]
  • Data Model Standards and Guidelines, Registration Policies And
    Data Model Standards and Guidelines, Registration Policies and Procedures Version 3.2 ● 6/02/2017 Data Model Standards and Guidelines, Registration Policies and Procedures Document Version Control Document Version Control VERSION D ATE AUTHOR DESCRIPTION DRAFT 03/28/07 Venkatesh Kadadasu Baseline Draft Document 0.1 05/04/2007 Venkatesh Kadadasu Sections 1.1, 1.2, 1.3, 1.4 revised 0.2 05/07/2007 Venkatesh Kadadasu Sections 1.4, 2.0, 2.2, 2.2.1, 3.1, 3.2, 3.2.1, 3.2.2 revised 0.3 05/24/07 Venkatesh Kadadasu Incorporated feedback from Uli 0.4 5/31/2007 Venkatesh Kadadasu Incorporated Steve’s feedback: Section 1.5 Issues -Change Decide to Decision Section 2.2.5 Coordinate with Kumar and Lisa to determine the class words used by XML community, and identify them in the document. (This was discussed previously.) Data Standardization - We have discussed on several occasions the cross-walk table between tabular naming standards and XML. When did it get dropped? Section 2.3.2 Conceptual data model level of detail: changed (S) No foreign key attributes may be entered in the conceptual data model. To (S) No attributes may be entered in the conceptual data model. 0.5 6/4/2007 Steve Horn Move last paragraph of Section 2.0 to section 2.1.4 Data Standardization Added definitions of key terms 0.6 6/5/2007 Ulrike Nasshan Section 2.2.5 Coordinate with Kumar and Lisa to determine the class words used by XML community, and identify them in the document.
    [Show full text]
  • Using Telelogic DOORS and Microsoft Visio to Model and Visualize Complex Business Processes
    Using Telelogic DOORS and Microsoft Visio to Model and Visualize Complex Business Processes “The Business Driven Application Lifecycle” Bob Sherman Procter & Gamble Pharmaceuticals [email protected] Michael Sutherland Galactic Solutions Group, LLC [email protected] Prepared for the Telelogic 2005 User Group Conference, Americas & Asia/Pacific http://www.telelogic.com/news/usergroup/us2005/index.cfm 24 October 2005 Abstract: The fact that most Information Technology (IT) projects fail as a result of requirements management problems is common knowledge. What is not commonly recognized is that the widely haled “use case” and Object Oriented Analysis and Design (OOAD) phenomenon have resulted in little (if any) abatement of IT project failures. In fact, ten years after the advent of these methods, every major IT industry research group remains aligned on the fact that these projects are still failing at an alarming rate (less than a 30% success rate). Ironically, the popularity of use case and OOAD (e.g. UML) methods may be doing more harm than good by diverting our attention away from addressing the real root cause of IT project failures (when you have a new hammer, everything looks like a nail). This paper asserts that, the real root cause of IT project failures centers around the failure to map requirements to an accurate, precise, comprehensive, optimized business model. This argument will be supported by a using a brief recap of the history of use case and OOAD methods to identify differences between the problems these methods were intended to address and the challenges of today’s IT projects.
    [Show full text]
  • Principles of the Concept-Oriented Data Model Alexandr Savinov
    Principles of the Concept-Oriented Data Model Alexandr Savinov Institute of Mathematics and Computer Science Academy of Sciences of Moldova Academiei 5, MD-2028 Chisinau, Moldova Fraunhofer Institute for Autonomous Intelligent System Schloss Birlinghoven, 53754 Sankt Augustin, Germany http://www.conceptoriented.com [email protected] Principles of the Concept-Oriented Data Model Alexandr Savinov Institute of Mathematics and Computer Science, Academy of Sciences of Moldova Academiei 5, MD-2028 Chisinau, Moldova Fraunhofer Institute for Autonomous Intelligent System Schloss Birlinghoven, 53754 Sankt Augustin, Germany http://www.conceptoriented.com [email protected] In the paper a new approach to data representation and manipulation is described, which is called the concept-oriented data model (CODM). It is supposed that items represent data units, which are stored in concepts. A concept is a combination of superconcepts, which determine the concept’s dimensionality or properties. An item is a combination of superitems taken by one from all the superconcepts. An item stores a combination of references to its superitems. The references implement inclusion relation or attribute- value relation among items. A concept-oriented database is defined by its concept structure called syntax or schema and its item structure called semantics. The model defines formal transformations of syntax and semantics including the canonical semantics where all concepts are merged and the data semantics is represented by one set of items. The concept-oriented data model treats relations as subconcepts where items are instances of the relations. Multi-valued attributes are defined via subconcepts as a view on the database semantics rather than as a built-in mechanism.
    [Show full text]
  • Metamodeling the Enhanced Entity-Relationship Model
    Metamodeling the Enhanced Entity-Relationship Model Robson N. Fidalgo1, Edson Alves1, Sergio España2, Jaelson Castro1, Oscar Pastor2 1 Center for Informatics, Federal University of Pernambuco, Recife(PE), Brazil {rdnf, eas4, jbc}@cin.ufpe.br 2 Centro de Investigación ProS, Universitat Politècnica de València, València, España {sergio.espana,opastor}@pros.upv.es Abstract. A metamodel provides an abstract syntax to distinguish between valid and invalid models. That is, a metamodel is as useful for a modeling language as a grammar is for a programming language. In this context, although the Enhanced Entity-Relationship (EER) Model is the ”de facto” standard modeling language for database conceptual design, to the best of our knowledge, there are only two proposals of EER metamodels, which do not provide a full support to Chen’s notation. Furthermore, neither a discussion about the engineering used for specifying these metamodels is presented nor a comparative analysis among them is made. With the aim at overcoming these drawbacks, we show a detailed and practical view of how to formalize the EER Model by means of a metamodel that (i) covers all elements of the Chen’s notation, (ii) defines well-formedness rules needed for creating syntactically correct EER schemas, and (iii) can be used as a starting point to create Computer Aided Software Engineering (CASE) tools for EER modeling, interchange metadata among these tools, perform automatic SQL/DDL code generation, and/or extend (or reuse part of) the EER Model. In order to show the feasibility, expressiveness, and usefulness of our metamodel (named EERMM), we have developed a CASE tool (named EERCASE), which has been tested with a practical example that covers all EER constructors, confirming that our metamodel is feasible, useful, more expressive than related ones and correctly defined.
    [Show full text]
  • Extending and Evaluating the Model-Based Product Definition
    NIST GCR 18-015 Extending and Evaluating the Model-based Product Definition Nathan W. Hartman Jesse Zahner Purdue University This publication is available free of charge from: https://doi.org/10.6028/NIST.GCR.18-015 NIST GCR 18-015 Extending and Evaluating the Model-based Product Definition Prepared for Thomas D. Hedberg, Jr. Allison Barnard Feeney U.S. Department of Commerce Engineering Laboratory National Institute of Standards and Technology Gaithersburg, MD 20899-8260 By Nathan W. Hartman Jesse Zahner PLM Center of Excellence Purdue University This publication is available free of charge from: https://doi.org/10.6028/NIST.GCR.18-015 December 2017 U.S. Department of Commerce Wilbur L. Ross, Jr., Secretary National Institute of Standards and Technology Walter Copan, NIST Director and Undersecretary of Commerce for Standards and Technology Disclaimer Any opinions, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect the views of the National Institute of Standards and Technology (NIST). Additionally, neither NIST nor any of its employees make any warranty, expressed or implied, nor assume any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, product, or process included in this publication. The report was prepared under cooperative agreement 70NANB15H311 between the National Institute of Standards and Technology and Purdue University. The statements and conclusions contained in this report are those of the authors and do not imply recommendations or endorsements by the National Institute of Standards and Technology. Certain commercial systems are identified in this report. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology.
    [Show full text]
  • The Importance of Data Models in Enterprise Metadata Management
    THE IMPORTANCE OF DATA MODELS IN ENTERPRISE METADATA MANAGEMENT October 19, 2017 © 2016 ASG Technologies Group, Inc. All rights reserved HAPPINESS IS… SOURCE: 10/18/2017 TODAY SHOW – “THE BLUE ZONE OF HAPPINESS” – DAN BUETTNER • 3 Close Friends • Get a Dog • Good Light • Get Religion • Get Married…. Stay Married • Volunteer • “Money will buy you Happiness… well, it’s more about Financial Security” • “Our Data Models are now incorporated within our corporate metadata repository” © 2016 ASG Technologies Group, Inc. All rights reserved 3 POINTS TO REMEMBER SOURCE: 10/19/2017 DAMA NYC – NOONTIME SPEAKER SLOT – MIKE WANYO – ASG TECHNOLOGIES 1. Happiness is individually sought and achievable © 2016 ASG Technologies Group, Inc. All rights reserved 3 POINTS TO REMEMBER SOURCE: 10/19/2017 DAMA NYC – NOONTIME SPEAKER SLOT – MIKE WANYO – ASG TECHNOLOGIES 1. Happiness is individually sought and achievable 2. Data Models can in be incorporated into your corporate metadata repository © 2016 ASG Technologies Group, Inc. All rights reserved 3 POINTS TO REMEMBER SOURCE: 10/19/2017 DAMA NYC – NOONTIME SPEAKER SLOT – MIKE WANYO – ASG TECHNOLOGIES 1. Happiness is individually sought and achievable 2. Data Models can in be incorporated into your corporate metadata repository 3. ASG Technologies can provide an overall solution with services to accomplish #2 above and more for your company. © 2016 ASG Technologies Group, Inc. All rights reserved AGENDA Data model imports to the metadata collection View and search capabilities Traceability of physical and logical
    [Show full text]
  • Towards the Universal Spatial Data Model Based Indexing and Its Implementation in Mysql
    Towards the universal spatial data model based indexing and its implementation in MySQL Evangelos Katsikaros Kongens Lyngby 2012 IMM-M.Sc.-2012-97 Technical University of Denmark Informatics and Mathematical Modelling Building 321, DK-2800 Kongens Lyngby, Denmark Phone +45 45253351, Fax +45 45882673 [email protected] www.imm.dtu.dk IMM-M.Sc.: ISSN XXXX-XXXX Summary This thesis deals with spatial indexing and models that are able to abstract the variety of existing spatial index solutions. This research involves a thorough presentation of existing dynamic spatial indexes based on R-trees, investigating abstraction models and implementing such a model in MySQL. To that end, the relevant theory is presented. A thorough study is performed on the recent and seminal works on spatial index trees and we describe their basic properties and the way search, deletion and insertion are performed on them. During this effort, we encountered details that baffled us, did not make the understanding the core concepts smooth or we thought that could be a source of confusion. We took great care in explaining in depth these details so that the current study can be a useful guide for a number of them. A selection of these models were later implemented in MySQL. We investigated the way spatial indexing is currently engineered in MySQL and we reveal how search, deletion and insertion are performed. This paves the path to the un- derstanding of our intervention and additions to MySQL's codebase. All of the code produced throughout this research was included in a patch against the RDBMS MariaDB.
    [Show full text]
  • HANDLE - a Generic Metadata Model for Data Lakes
    HANDLE - A Generic Metadata Model for Data Lakes Rebecca Eichler, Corinna Giebler, Christoph Gr¨oger, Holger Schwarz, and Bernhard Mitschang In: Big Data Analytics and Knowledge Discovery (DaWaK) 2020 BIBTEX: @inproceedingsfeichler2020handle, title=fHANDLE-A Generic Metadata Model for Data Lakesg, author=fEichler, Rebecca and Giebler, Corinna and Gr{\"o\}ger,Christoph and Schwarz, Holger and Mitschang, Bernhardg, booktitle=fInternational Conference on Big Data Analytics and Knowledge Discovery (DaWaK) 2020g, pages=f73{88g, year=f2020g, organization=fSpringerg, doi=f10.1007/978-3-030-59065-9 7g g c by Springer Nature This is the author's version of the work. The final authenticated version is avail- able online at: https://doi.org/10.1007/978-3-030-59065-9 7 HANDLE - A Generic Metadata Model for Data Lakes Rebecca Eichler1, Corinna Giebler1, Christoph Gr¨oger2, Holger Schwarz1, and Bernhard Mitschang1 1 University of Stuttgart, Universit¨atsstraße38, 70569 Stuttgart, Germany [email protected] 2 Robert Bosch GmbH, Borsigstraße 4, 70469 Stuttgart, Germany [email protected] Abstract The substantial increase in generated data induced the de- velopment of new concepts such as the data lake. A data lake is a large storage repository designed to enable flexible extraction of the data's value. A key aspect of exploiting data value in data lakes is the collec- tion and management of metadata. To store and handle the metadata, a generic metadata model is required that can reflect metadata of any potential metadata management use case, e.g., data versioning or data lineage. However, an evaluation of existent metadata models yields that none so far are sufficiently generic.
    [Show full text]
  • XV. the Entity-Relationship Model
    Information Systems Analysis and Design csc340 XV. The Entity-Relationship Model The Entity-Relationship Model Entities, Relationships and Attributes Cardinalities, Identifiers and Generalization Documentation of E-R Diagrams and Business Rules 2002 John Mylopoulos The Entity-Relationship Model -- 1 Information Systems Analysis and Design csc340 The Entity Relationship Model The Entity-Relationship (ER) model is a conceptual data model, capable of describing the data requirements for a new information system in a direct and easy to understand graphical notation. Data requirements are described in terms of a conceptualconceptual (or, ER) schema. ER schemata are comparable to class diagrams. 2002 John Mylopoulos The Entity-Relationship Model -- 2 Information Systems Analysis and Design csc340 The Constructs of the E-R Model AND/XOR 2002 John Mylopoulos The Entity-Relationship Model -- 3 Information Systems Analysis and Design csc340 Entities These represent classes of objects (facts, things, people,...) that have properties in common and an autonomous existence. City, Department, Employee, Purchase and Sale are examples of entities for a commercial organization. An instance of an entity is an object in the class represented by the entity. Stockholm, Helsinki, are examples of instances of the entity City, and the employees Peterson and Johanson are examples of instances of the Employee entity. The E-R model is very different from the relational model in which it is not possible to represent an object without knowing its properties (an employee is represented by a tuple containing the name, surname, age, and other attributes.) 2002 John Mylopoulos The Entity-Relationship Model -- 4 Information Systems Analysis and Design csc340 Examples of Entities 2002 John Mylopoulos The Entity-Relationship Model -- 5 Information Systems Analysis and Design csc340 Relationships They represent logical links between two or more entities.
    [Show full text]
  • Patterns for Slick Database Applications Jan Christopher Vogt, EPFL Slick Team
    Patterns for Slick database applications Jan Christopher Vogt, EPFL Slick Team #scalax Recap: What is Slick? (for ( c <- coffees; if c.sales > 999 ) yield c.name).run select "COF_NAME" from "COFFEES" where "SALES" > 999 Agenda • Query composition and re-use • Getting rid of boiler plate in Slick 2.0 – outer joins / auto joins – auto increment – code generation • Dynamic Slick queries Query composition and re-use For-expression desugaring in Scala for ( c <- coffees; if c.sales > 999 ) yield c.name coffees .withFilter(_.sales > 999) .map(_.name) Types in Slick class Coffees(tag: Tag) extends Table[C](tag,"COFFEES”) { def * = (name, supId, price, sales, total) <> ... val name = column[String]("COF_NAME", O.PrimaryKey) val supId = column[Int]("SUP_ID") val price = column[BigDecimal]("PRICE") val sales = column[Int]("SALES") val total = column[Int]("TOTAL") } lazy val coffees = TableQuery[Coffees] <: Query[Coffees,C] coffees(coffees.map ).map( :TableQuery(c => c.[Coffees,_]name) ).map( (c ) => ((c:Coffees) => (c.name ) : Column[String]) ): Query[Column[String],String] Table extensions class Coffees(tag: Tag) extends Table[C](tag,"COFFEES”) { … val price = column[BigDecimal]("PRICE") val sales = column[Int]("SALES") def revenue = price.asColumnOf[Double] * sales.asColumnOf[Double] } coffees.map(c => c.revenue) Query extensions implicit class QueryExtensions[T,E] ( val q: Query[T,E] ){ def page(no: Int, pageSize: Int = 10) : Query[T,E] = q.drop( (no-1)*pageSize ).take(pageSize) } suppliers.page(5) coffees.sortBy(_.name).page(5) Query extensions
    [Show full text]
  • HHS Logical Data Model Practices Guide
    DEPARTMENT OF HEALTH AND HUMAN SERVICES ENTERPRISE PERFORMANCE LIFE CYCLE FRAMEWORK <OIDV Logo> PPRRAACCTTIIICCEESS GGUUIIIDDEE LOGICAL DATA MODELING Issue Date: <mm/dd/yyyy> Revision Date: <mm/dd/yyyy> Document Purpose This Practices Guide is provides an overview describing the best practices, activities, attributes, and related templates, tools, information, and key terminology of industry-leading project management practices and their accompanying project management templates. Background The Department of Health and Human Services (HHS) Enterprise Performance Life Cycle (EPLC) is a framework to enhance Information Technology (IT) governance through rigorous application of sound investment and project management principles, and industry best practices. The EPLC provides the context for the governance process and describes interdependencies between its project management, investment management, and capital planning components. The EPLC framework establishes an environment in which HHS IT investments and projects consistently achieve successful outcomes that align with Department and Operating Division goals and objectives. A Logical Data Model (LDM) is a complete representation of data requirements and the structural business rules that govern data quality in support of project’s requirements. The LDM shows the specific entities, attributes and relationships involved in a business function and is the basis for the creation of the physical data model. Practice Overview A logical data model is a graphical representation of the business requirements. They describe the data that are important to the organization and how they relate to one another, as well as business definitions and examples. The logical data model is created during the Requirements Analysis Phase and is a component of the Requirements Document. The logical data model helps to build a common understanding of business data elements and requirements and is the basis of physical database design.
    [Show full text]