Graph Processing: a Panoramic View and Some Open Problems
Total Page:16
File Type:pdf, Size:1020Kb
Graph Processing: A Panoramic View and Some Open Problems M. Tamer Ozsu¨ University of Waterloo David R. Cheriton School of Computer Science https://cs.uwaterloo.ca/~tozsu © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)1 Graph Theory Graph Graph Algorithms Systems Graph Research is Dispersed Knowledge graphs Semantic Web Graph Graph Databases Analytics © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)2 Knowledge graphs Semantic Web Graph Graph Databases Analytics Graph Research is Dispersed Graph Theory Graph Graph Algorithms Systems © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)2 Knowledge graphs Graph Theory Semantic Web Graph Graph AlgorithmsDatabases AnalyticsSystems Graph Research is Dispersed Database Social Data Mining Computing AI/ML © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)2 Knowledge graphs Graph Theory Semantic Web Graph Graph AlgorithmsDatabases AnalyticsSystems Graph Research is Dispersed Knowledge graphs Semantic Web Graph ? Graph Databases Analytics Database Graph Theory Social Data Mining Computing Graph Graph Algorithms Systems AI/ML © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)2 RDF graph Property graph My objectives Three things... 1 Discuss a way to coherently position work in the various communities; 2 A tour across different communities to provide a panoramic view of the research; 3 Highlight some problems that interest me! Knowledge graphs Semantic web Graph Graph Analytics DBMSs Systems © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)3 My objectives Three things... 1 Discuss a way to coherently position work in the various communities; 2 A tour across different communities to provide a panoramic view of the research; 3 Highlight some problems that interest me! RDF graph Knowledge graphs Semantic web Graph Graph Analytics DBMSs Systems Property graph © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)3 Freud's Recommendation for a Good Talk... By Way of Moshe Vardi... © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)4 Introduction RDF Engines Graph DBMSs Graph Analytics Systems Dynamic & Streaming Graphs Concluding Remarks © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)5 In the beginning... Aircraft Engine Right engine ... there was IMS Left engine Body By IBM (along with Fuselage FWD Rockwell & Caterpillar) Section 4 Section 43 For the Apollo program Fuselage AFT First deployed in 1968 Section 46 Empennage Managing Bill of Materials ... (BOM) of Saturn rocket Wings Hierarchical model because Left Wing LW Leading Edge BOM is hierarchical RW Leading Edge Wing Box ... Right Wing ... © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)6 In the beginning... ... and IDS By GE To control their manufacturing processes First deployed in 1964 Manufacturing processes (with scheduling constraints) form a graph Network model Led to CODASYL standard Source: https://dba.stackexchange.com/questions/ 119380/er-vs-database-schema-diagrams © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)6 D´ej`avu all over again? Network models were also used in I Object DBMSs I XML Network (CODASYL) Data Model • 4. The arrow is directed to the member of the corresponding set type. Consider a more complicated example. Suppose a company produces and sells computers. In this case all record types are evident: • 1. Computer products that the company sells (PRODUCT); • 2. Customers who buy the products (CUSTOMER); • 3.Network Representatives who sell the DBMSsproducts (REPRESENT); Network (CODASYL) Data Model • 4. Sales transactions (TRANSACTION); All set types are also evident: The Data Sets might look as follows: • 1. Product and all transactions which include this company's product (PT); • 1. Products and all transactions which include these company's products (PT); • 2. Customer and all transactionsCODASYL which were done by Languagethis customer (CT); • 3. Representative and all transactions which were done by this representative (RT); • 2. Customer and all transactions which were done by these customers(CT); A current state of the database miIght lookFIND as follows: The withRecords, for keyexample, might be: • 3. Representatives and all transactions which were done by these representatives(RT). • 1. Computer products that the company sells (PRODUCT); 2.3 Data Updating Facilities • 2. Customers who buy theI productsNavigate (CUSTOMER); within the set, within elements of the same We have discussed only the first part of Network Data Model - the data description facilities. • 3. Representatives who sell the products (REPRESENT); record type, etc Each data model also includes particular data manipulation facilities - Data Manipulation • 4. Sales transactions (TRANSACTION); Language (DML). The data manipulation facilities of a concrete DML can be also divided into two parts: • (i) data update functions; • (ii) data retrieve functions. The main update functions of a Network data manipulation language are: • 1. To store new occurrences of the record type declared in the current data base schema. • 2. To modify existing occurrences of the record type declared in the current data base schema. • 3. To delete existing occurrences of the record type declared in the current data base schema. Source: Network (CODASYL) Data Model, https://coronet.iicm.tugraz.at/is/scripts/lesson03.pdf• 4. To insert existing occurrences of the record type declared in the current data base schema as a member of a certain data set into the exactly one occurrence of this data set. © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)7 • 5.To remove existing occurrence of the member of the data set from the occurrence of this data set. Putting a new record occurrence into a database: D´ej`avu all over again? Network (CODASYL) Data Model • 4. The arrow is directed to the member of the corresponding set type. Consider a more complicated example. Suppose a company produces and sells computers. In this case all record types are evident: • 1. Computer products that the company sells (PRODUCT); • 2. Customers who buy the products (CUSTOMER); • 3.Network Representatives who sell the DBMSsproducts (REPRESENT); Network (CODASYL) Data Model • 4. Sales transactions (TRANSACTION); All set types are also evident: The Data Sets might look as follows: • 1. Product and all transactions which include this company's product (PT); • 1. Products and all transactions which include these company's products (PT); • 2. Customer and all transactionsCODASYL which were done by Languagethis customer (CT); • 3. Representative and all transactions which were done by this representative (RT); • 2. Customer and all transactions which were done by these customers(CT); A current state of the database miIght lookFIND as follows: The withRecords, for keyexample, might be: • 3. Representatives and all transactions which were done by these representatives(RT). • 1. Computer products that the company sells (PRODUCT); 2.3 Data Updating Facilities • 2. Customers who buy theI productsNavigate (CUSTOMER); within the set, within elements of the same We have discussed only the first part of Network Data Model - the data description facilities. • 3. Representatives who sell the products (REPRESENT); record type, etc Each data model also includes particular data manipulation facilities - Data Manipulation • 4. Sales transactions (TRANSACTION); Language (DML). The data manipulation facilities of a concrete DML can be also divided into two parts: • (i) data update functions; Network models were also• (ii) used data retrieve in functions. The main update functions of a Network data manipulation language are: I Object DBMSs • 1. To store new occurrences of the record type declared in the current data base schema. • 2. To modify existing occurrences of the record type declared in the current data base I XML schema. • 3. To delete existing occurrences of the record type declared in the current data base schema. Source: Network (CODASYL) Data Model, https://coronet.iicm.tugraz.at/is/scripts/lesson03.pdf• 4. To insert existing occurrences of the record type declared in the current data base schema as a member of a certain data set into the exactly one occurrence of this data set. © M. Tamer Ozsu¨ VLDB 2019 (2019/08/27)7 • 5.To remove existing occurrence of the member of the data set from the occurrence of this data set. Putting a new record occurrence into a database: Network (CODASYL) Data Model • 4. The arrow is directed to the member of the corresponding set type. Consider a more complicated example. Suppose a company produces and sells computers. In this case all record types are evident: • 1. Computer products that the company sells (PRODUCT); • 2. Customers who buy the products (CUSTOMER); • 3.Network Representatives who sell the DBMSsproducts (REPRESENT); Network (CODASYL) Data Model • 4. Sales transactions (TRANSACTION); All set types are also evident: The Data Sets might look as follows: • 1. Product and all transactions which include this company's product (PT); • 1. Products and all transactions which include these company's products (PT); • 2. Customer and all transactionsCODASYL which were done by Languagethis customer (CT); • 3. Representative and all transactions which were done by this representative (RT); • 2. Customer and all transactions which were done by these customers(CT); A current state of the database miIght lookFIND as follows: The withRecords, for keyexample, might be: • 3. Representatives and all transactions which were done by these representatives(RT). • 1. Computer products that the company sells (PRODUCT); 2.3 Data Updating Facilities • 2. Customers who buy theI productsNavigateD´ej`avu (CUSTOMER); within the all set, over within elements again? of the same We have discussed only the first part