HIERARCHICAL AND SEMANTIC DATA MANAGEMENT AND QUERYING FOR PATIENT RECORDS AND PERSONAL PHOTOS
By BRENDAN DAVID ELLIOTT


Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Dissertation Adviser: Dr. Z. Meral Özsoyoğlu

Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
January, 2009

CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of Brendan David Elliott, candidate for the Doctor of Philosophy degree *.

(signed) Z. Meral Özsoyoğlu (chair of the committee)
Daniela Calvetti
H. Andy Podgurski
Guo-Qiang Zhang
Gultekin Özsoyoğlu

(date) September 19, 2008

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of Contents

List of Tables
List of Figures
Acknowledgements
Abstract
1 Introduction
  1.1 Part I: Pedigree Data Management (Chapters 2–6)
  1.2 Part II: Semantic Personal Photo Management (Chapters 7–9)
  1.3 Part III: Semantic Query Processing (Chapters 10–12)
Part I: Pedigree Data Management
2 An Overview of Pedigree Data Management
  2.1 Pedigrees
  2.2 Pedigree Querying
  2.3 Previous Work
  2.4 Pedigree Modeling
  2.5 Discussion
3 PQL: A Language for Querying Pedigree Data
  3.1 Query starting steps
  3.2 Basic axis steps:
  3.3 Gendered steps
  3.4 Simple Attributes
  3.5 Conditional Steps – predicates
  3.6 Set expression steps
  3.7 User-defined (macro) steps
  3.8 Aggregate Functions
  3.9 Combined use with XPath
  3.10 More Examples
  3.11 Discussion
4 Evaluation of PQL Queries using NodeCodes
  4.1 Labeling for PQL: NodeCodes for Pedigree graphs
    4.1.1 Representing Paths with NodeCodes
    4.1.2 Query Evaluation with NodeCodes
  4.2 NodeCode Updates
    4.2.1 Child insertion
    4.2.2 Progenitor insertion
    4.2.3 Missing link insertion
    4.2.4 Merging pedigrees
  4.3 Synthetic Data Generation
  4.4 Experimental Results
    4.4.1 Experimental data:
    4.4.2 Experiments on Real Data
    4.4.3 Experiments on Synthetic Data
  4.5 Discussion
5 Efficient Evaluation of Inbreeding Queries
  5.1 Inbreeding
  5.2 Review of Pedigree Graph Structure and NodeCodes
  5.3 Inbreeding Calculations
  5.4 Inbreeding Coefficient
  5.5 Calculating Inbreeding Coefficient with NodeCodes
    5.5.1 Identifying Common Ancestors
    5.5.2 Identifying pairs of paths from common ancestors
    5.5.3 Identifying Overlapping Pairs of Paths
    5.5.4 Complexity of Algorithm:
  5.6 Experiments
    5.6.1 Experimental Data
    5.6.2 Experimental Setup
    5.6.3 Experimental Results
  5.7 Discussion
6 Family NodeCodes for Inbreeding Queries
  6.1 Family-level Graph
    6.1.1 Family-level Graph Structure
    6.1.2 Scalability of Family-level Pedigree Graphs
  6.2 Family-level