Paper 928 Complex Group-By Queries for XML

Total Page:16

File Type:pdf, Size:1020Kb

Paper 928 Complex Group-By Queries for XML Paper 928 Complex Group-By Queries for XML C. Gokhale+, N. Gupta+, P. Kumar+, L.V.S. Lakshmanan∗, R. Ng∗, and B.A. Prakash+ + Indian Institute of Technology, Bombay ∗ University of British Columbia, Canada 1 Introduction also omit the selection part of the query, and just focus on the aggregation part. The popularity of XML as a data exchange standard has group //Book led to the emergenceof powerfulXML query languages like by //Name return ( //Name, avg(/Price), count(*) XQuery [21] and studies on XML query optimization. Of then by /Year return ( late, there is considerable interest in analytical processing /Year, median(/#Sold) ) ) of XML data (e.g.,[2, 3]). As pointed out by Borkar and The intent of the query is to group Book nodes first Carey in [3], even for data integration, there is a compelling by (publisher) Name. For each group, we get the average need for performing various group-by style aggregate oper- Price, andthe numberof Book nodesin the group. More- ations. A core operator needed for analytics is the group- over, for each publisher group, the book nodes are further by operator, which is widely used in relational as well as sub-grouped by Year. For each of these nested groups, the OLAP database applications. XQuery requires group-by median #Sold per year is returned. operations to be simulated using nesting [2]. Studies addressing the need for XML grouping fall into Q1−Answer two broad categories: (1) Provide support for grouping at Name−group Name−group the logical or physical level [6] and recognize grouping op- erations from nested queries and rewrite them with group- Name avg count() Year−group Year−group Kaufman (Price) 258 Name avg count() Year−group ing operations [4, 5, 9, 12]. (2) Extend XQuery FLWOR $70 Wesley (Price) 175 Year median Year median $120 expressions with explicit constructs similar to the group-by, 1999 (#Sold) 2000 (#Sold) Year median 5600 26000 1999 (#Sold) order-by and having clauses in SQL [3, 2]. However, direct 2300 algorithmic support for a group-by operator is not explored. In this paper, we focuson efficient processing of a group- Figure 2. Partial Result of Q1 by operator for XML – with the additional goal of sup- porting a full spectrum of aggregation operations, including Figure 2 shows the answer to query Q1 for the (par- holistic ones such as median() [8] and complex nested tial) data in Figure 1. The first group shown, for instance, Name = Kaufman aggregations, together with having clause, as well as mov- is for . Among the 258 books in this ing window aggregation. group, the average price is $70. These books are further sub-grouped by Year. For each year that appears in the in- Consider the simple catalogue example in Figure 1. This put data, the median number of copies sold is also returned can be part of an input XML database, or intermediate re- (e.g., 5600 for 1999). We can enhance nested group-by sult of a query. The catalogue is heterogeneous: it contains query Q with two features, as illustrated by query Q : information about books, music CDs, etc. Books are orga- 1 2 group //Book nized by Subject, e.g., physics, chemistry. For by //Name each book, there is information on its Title, Author, having count(*) ≥ 100 return ( //Name, avg(/Price), count(*) Year, #Sold, Price, (publisher) Name, etc. Books may then by /Year return ( have multiple authors. The data value at a leaf node is /Year(10,5), median(/#Sold) ) ) shown in italics. The node id of a node is also shown for In Q2, the having-clause for the outer block removes future discussion. publishers with the total number of book nodes less than Consider the following nested group-by query Q1. 100. Besides, we form moving windows over years – with While we could follow the syntax proposed by [2], syntax each window having a width of 10 years and a step size not being our main focus, we use a more concise form. We of 5 years (e.g., [1990,2000], [1995,2005], etc.). While in 1 Catalogue Subject Subject (22) Music CD (1) (23) (24) (33) (13) Name Name (2) Book (3) Book Book Book Physics Chemistry (5) (6) (7) (8) (9) (10) (14) (15) (16) (17) (18) (19) (25) (26) (27) (28) (29) (30) (4)PubInfo TitleAuthor Author Year #Sold Price PubInfo Title Author Year #Sold Price PubInfo Title Author Year #Sold Price Sound Mcmanus 2000 800 $96 Carbon Johnson 1999 8000 $25 MechnicsNewton Smith 1999 5600 $60 Compounds (11) Waves (20) (31) (12) (21) Name City Name City Name City (32) Kaufman NY Wesley LA Kaufman NY Figure 1. The Catalogue Example the next section we will present a more comprehensive set performance than simulating grouping via nesting. None of moving window options, it should be easy to appreci- of these papers discuss algorithms for directly computing ate the value of supporting nested group-bys with having group-bys (with possible nesting, having, and moving win- clauses and moving windows for XML querying. In princi- dows). ple, all value aggregationsrequired of XML can be obtained A second line of studies investigates how to support by shredding it to relations and using SQL (the “SQL ap- group-by at a logical or physical level [6], and detect proach”). We examine this issue empirically in Section 7 group-bys from nested queries and rewrite them with ex- with an emphasis on queries involving grouping together plicit grouping operations [4, 5, 9, 12]. However, detect- with nesting. Indeed, owing to XML’s inherent hierarchical ing grouping inherent in nested queries is challenging and nature, nested group-by (e.g., query Q1) is a fundamental such queries are hard to express and understand. In partic- type of group-by that merits study. In our experiments, we ular, the focus of [12] is on structural aggregation by node observed an order of magnitude difference between the per- types as opposed to value aggregation. Studies by Fiebig formance of the SQL approach (using Oracle) and ours. We and Moerkotte [6], Pedersen et al. [13], and Deutsch et make the following contributions. al. [4] all consider using query optimization-style rewrite • We propose a framework for expressing complex ag- rules for various kinds of grouping. The transformed query gregationqueries on XML data featuring nested group- plan would be based on nested loops. by, having cluase, and moving windows (Section 3). There is an extensive body of work on efficient com- • We develop a disk-based algorithm for efficient eval- putation of group-by and cube queries for relational data uation of queries involving any subset of the above (e.g., [8, 10]). These algorithms are not directly applica- fearures (Section 5). ble to hierarchical data especially when group-by elements • We discuss the results of a comprehensiveset of exper- (β’s) may involve combination of forward and backward iments comparing our approach with that of shredding axes and aggregations on values may be nested and may XML into relations and using SQL, and with those of occur at multiple levels (e.g., Q2). Of course, by shred- Galax [7] and Qizx [17], validating the efficiency of ding XML to relations, all such queries can be expressed in our algorithm and the effectiveness of the optimiza- SQL. The performance impact of this approach compared tions developed (Section 7). with our direct approach is discussed in Section 7. Related work appears in the next section. Section 8 sum- Finally, [15, 19] study XPath selectivity estimation to marizes the paper and discusses future work. obtain statistical summaries and approximate answers for XPath expressions. They do not directly support exact com- 2 Related Work putation of group-bys. While for relational data, SQL provides explicit support 3 Class of Nested Group-bys for group-by, XQuery requires us to simulate it using nest- ing. It has been noted that this leads to expressions that are 3.1 GeneralFormof1-levelNestingandExamples hard to read, write, and process efficiently [2, 3]. Beyer et The examples discussed so far are instances of the gen- al. [2] and Borkar and Carey [3] propose syntactic exten- sions to XQuery FLOWR expressions to provide explicit eral form of a one-level nested group-by query below. group α support for group-by. They also demonstrate how related where Cons out out out out analytics such as moving window aggregations and cube by β1 (mw1 ) ...βk (mwk ) having AggConsout return ( can also be expressed in the extended syntax. Beyer et out out out out out out β1 ,...,βk , agg1 (γ1 ),...,aggm (γm ) in in in in al. report preliminary experimental results indicating better then by β1 (mw1 ),...,βp (mwp ) 2 Catalogue any book received in a group. As a last example of nested Publisher Publisher aggregation, spread(Rating)=def max(Rating) − Name Location Location Penguin min(Rating) combines min() and max(). out Name Year Year Aggregation Conditions: Cons, AggCons and New York in Value Book Book AggCons are sets of conditions. Cons in the 1999 where clause are the usual node-level selection conditions. Subject Title #Sold Review Review Price Science Fiction 200,000 $15 out in Foundation AggCons and AggCons are sets of aggregation Reviewer Rating John Doe 9 conditions of the form aggi(γi) θi ci, where θi ∈ {=, 6= ,>,<, ≥, leq}, and ci is a constant. With the use of the Figure 3. An Example Illustrating Node Type Inversion having clause, iceberg queries can be easily expressed in the proposed framework. Moving windows: Moving window queries have been having AggConsin return ( in in in in in in studied extensively for relational databases [18].
Recommended publications
  • Aggregate Order by Clause
    Aggregate Order By Clause Dialectal Bud elucidated Tuesdays. Nealy vulgarizes his jockos resell unplausibly or instantly after Clarke hurrah and court-martial stalwartly, stanchable and jellied. Invertebrate and cannabic Benji often minstrels some relator some or reactivates needfully. The default order is ascending. Have exactly match this assigned stream aggregate functions, not work around with security software development platform on a calculation. We use cookies to ensure that we give you the best experience on our website. Result output occurs within the minimum time interval of timer resolution. It is not counted using a table as i have group as a query is faster count of getting your browser. Let us explore it further in the next section. If red is enabled, when business volume where data the sort reaches the specified number of bytes, the collected data is sorted and dumped into these temporary file. Kris has written hundreds of blog articles and many online courses. Divides the result set clock the complain of groups specified as an argument to the function. Threat and leaves only return data by order specified, a human agents. However, this method may not scale useful in situations where thousands of concurrent transactions are initiating updates to derive same data table. Did you need not performed using? If blue could step me first what is vexing you, anyone can try to explain it part. It returns all employees to database, and produces no statements require complex string manipulation and. Solve all tasks to sort to happen next lesson. Execute every following query access GROUP BY union to calculate these values.
    [Show full text]
  • SQL from Wikipedia, the Free Encyclopedia Jump To: Navigation
    SQL From Wikipedia, the free encyclopedia Jump to: navigation, search This article is about the database language. For the airport with IATA code SQL, see San Carlos Airport. SQL Paradigm Multi-paradigm Appeared in 1974 Designed by Donald D. Chamberlin Raymond F. Boyce Developer IBM Stable release SQL:2008 (2008) Typing discipline Static, strong Major implementations Many Dialects SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2008 Influenced by Datalog Influenced Agena, CQL, LINQ, Windows PowerShell OS Cross-platform SQL (officially pronounced /ˌɛskjuːˈɛl/ like "S-Q-L" but is often pronounced / ˈsiːkwəl/ like "Sequel"),[1] often referred to as Structured Query Language,[2] [3] is a database computer language designed for managing data in relational database management systems (RDBMS), and originally based upon relational algebra. Its scope includes data insert, query, update and delete, schema creation and modification, and data access control. SQL was one of the first languages for Edgar F. Codd's relational model in his influential 1970 paper, "A Relational Model of Data for Large Shared Data Banks"[4] and became the most widely used language for relational databases.[2][5] Contents [hide] * 1 History * 2 Language elements o 2.1 Queries + 2.1.1 Null and three-valued logic (3VL) o 2.2 Data manipulation o 2.3 Transaction controls o 2.4 Data definition o 2.5 Data types + 2.5.1 Character strings + 2.5.2 Bit strings + 2.5.3 Numbers + 2.5.4 Date and time o 2.6 Data control o 2.7 Procedural extensions * 3 Criticisms of SQL o 3.1 Cross-vendor portability * 4 Standardization o 4.1 Standard structure * 5 Alternatives to SQL * 6 See also * 7 References * 8 External links [edit] History SQL was developed at IBM by Donald D.
    [Show full text]
  • Overview of SQL:2003
    OverviewOverview ofof SQL:2003SQL:2003 Krishna Kulkarni Silicon Valley Laboratory IBM Corporation, San Jose 2003-11-06 1 OutlineOutline ofof thethe talktalk Overview of SQL-2003 New features in SQL/Framework New features in SQL/Foundation New features in SQL/CLI New features in SQL/PSM New features in SQL/MED New features in SQL/OLB New features in SQL/Schemata New features in SQL/JRT Brief overview of SQL/XML 2 SQL:2003SQL:2003 Replacement for the current standard, SQL:1999. FCD Editing completed in January 2003. New International Standard expected by December 2003. Bug fixes and enhancements to all 8 parts of SQL:1999. One new part (SQL/XML). No changes to conformance requirements - Products conforming to Core SQL:1999 should conform automatically to Core SQL:2003. 3 SQL:2003SQL:2003 (contd.)(contd.) Structured as 9 parts: Part 1: SQL/Framework Part 2: SQL/Foundation Part 3: SQL/CLI (Call-Level Interface) Part 4: SQL/PSM (Persistent Stored Modules) Part 9: SQL/MED (Management of External Data) Part 10: SQL/OLB (Object Language Binding) Part 11: SQL/Schemata Part 13: SQL/JRT (Java Routines and Types) Part 14: SQL/XML Parts 5, 6, 7, 8, and 12 do not exist 4 PartPart 1:1: SQL/FrameworkSQL/Framework Structure of the standard and relationship between various parts Common definitions and concepts Conformance requirements statement Updates in SQL:2003/Framework reflect updates in all other parts. 5 PartPart 2:2: SQL/FoundationSQL/Foundation The largest and the most important part Specifies the "core" language SQL:2003/Foundation includes all of SQL:1999/Foundation (with lots of corrections) and plus a number of new features Predefined data types Type constructors DDL (data definition language) for creating, altering, and dropping various persistent objects including tables, views, user-defined types, and SQL-invoked routines.
    [Show full text]
  • Real Federation Database System Leveraging Postgresql FDW
    PGCon 2011 May 19, 2011 Real Federation Database System leveraging PostgreSQL FDW Yotaro Nakayama Technology Research & Innovation Nihon Unisys, Ltd. The PostgreSQL Conference 2011 ▍Motivation for the Federation Database Motivation Data Integration Solution view point ►Data Integration and Information Integration are hot topics in these days. ►Requirement of information integration has increased in recent year. Technical view point ►Study of Distributed Database has long history and the background of related technology of Distributed Database has changed and improved. And now a days, the data moves from local storage to cloud. So, It may be worth rethinking the technology. The PostgreSQL Conference 2011 1 All Rights Reserved,Copyright © 2011 Nihon Unisys, Ltd. ▍Topics 1. Introduction - Federation Database System as Virtual Data Integration Platform 2. Implementation of Federation Database with PostgreSQL FDW ► Foreign Data Wrapper Enhancement ► Federated Query Optimization 3. Use Case of Federation Database Example of Use Case and expansion of FDW 4. Result and Conclusion The PostgreSQL Conference 2011 2 All Rights Reserved,Copyright © 2011 Nihon Unisys, Ltd. ▍Topics 1. Introduction - Federation Database System as Virtual Data Integration Platform 2. Implementation of Federation Database with PostgreSQL FDW ► Foreign Data Wrapper Enhancement ► Federated Query Optimization 3. Use Case of Federation Database Example of Use Case and expansion of FDW 4. Result and Conclusion The PostgreSQL Conference 2011 3 All Rights Reserved,Copyright ©
    [Show full text]
  • SQL Views for Medical Billing
    SQL Views for Cortex Medical Billing SQL Views To execute a view in the SQL Query Analyzer window “Select * from view_name”. To narrow the results of a view you may add a WHERE clause. The column named in the WHERE clause must match the column name in the view in the following order: 1. A column name that follows the view name 2. A column name in the SELECT statement of the view To find out the name of a column in a view, double click on the view name in Enterprise Manager. EXAMPLE: CREATE VIEW ClientPayment_view( ClPaymentServiceNumber,ClPaymentAmount ) AS SELECT ClientPayment.ClientServiceNumber, SUM(ClientPayment.Amount) FROM ClientPayment GROUP BY ClientPayment.ClientServiceNumber In this example, in order to execute a SQL query with a WHERE clause on the ClientPayment_view you must use the column name “ClPaymentServiceNumber” or “ClPaymentAmount”. If the column names (ClPaymentServiceNumber,ClPaymentAmount) did *NOT* follow the view name you could use the column names in the SELECT statement (ClientServiceNumber, Amount). Example SELECT statement with WHERE clause: “ CONFIDENTIAL Page 1 of 4 SQL Views for Cortex Medical Billing SELECT * FROM ClientPayment_view WHERE ClPaymentServiceNumber = '0000045'” When writing a view where you only want one record returned to sum a field like amount (i.e., SUM(Amount)) you cannot return any other columns from the table where the data will differ, like EnteredDate, because the query is forced to return each record separately, it can no longer combine them. View Name Crystal Report Join Returned
    [Show full text]
  • Efficient Processing of Window Functions in Analytical SQL Queries
    Efficient Processing of Window Functions in Analytical SQL Queries Viktor Leis Kan Kundhikanjana Technische Universitat¨ Munchen¨ Technische Universitat¨ Munchen¨ [email protected] [email protected] Alfons Kemper Thomas Neumann Technische Universitat¨ Munchen¨ Technische Universitat¨ Munchen¨ [email protected] [email protected] ABSTRACT select location, time, value, abs(value- (avg(value) over w))/(stddev(value) over w) Window functions, also known as analytic OLAP functions, have from measurement been part of the SQL standard for more than a decade and are now a window w as ( widely-used feature. Window functions allow to elegantly express partition by location many useful query types including time series analysis, ranking, order by time percentiles, moving averages, and cumulative sums. Formulating range between 5 preceding and 5 following) such queries in plain SQL-92 is usually both cumbersome and in- efficient. The query normalizes each measurement by subtracting the aver- Despite being supported by all major database systems, there age and dividing by the standard deviation. Both aggregates are have been few publications that describe how to implement an effi- computed over a window of 5 time units around the time of the cient relational window operator. This work aims at filling this gap measurement and at the same location. Without window functions, by presenting an efficient and general algorithm for the window it is possible to state the query as follows: operator. Our algorithm is optimized for high-performance main- memory database systems and has excellent performance on mod- select location, time, value, abs(value- ern multi-core CPUs. We show how to fully parallelize all phases (select avg(value) of the operator in order to effectively scale for arbitrary input dis- from measurement m2 tributions.
    [Show full text]
  • PROC SQL for PROC SUMMARY Stalwarts Christianna Williams Phd, Chapel Hill, NC
    SESUG Paper 219-2019 PROC SQL for PROC SUMMARY Stalwarts Christianna Williams PhD, Chapel Hill, NC ABSTRACT One of the fascinating features of SAS® is that the software often provides multiple ways to accomplish the same task. A perfect example of this is the aggregation and summarization of data across multiple rows or BY groups of interest. These groupings can be study participants, time periods, geographical areas, or really just about any type of discrete classification that one desires. While many SAS programmers may be accustomed to accomplishing these aggregation tasks with PROC SUMMARY (or equivalently, PROC MEANS), PROC SQL can also do a bang-up job of aggregation – often with less code and fewer steps. This step-by-step paper explains how to use PROC SQL for a variety of summarization and aggregation tasks. It uses a series of concrete, task-oriented examples to do so. For each example, both the PROC SUMMARY method and the PROC SQL method will be presented, along with discussion of pros and cons of each approach. Thus, the reader familiar with either technique can learn a new strategy that may have benefits in certain circumstances. The presentation format will be similar to that used in the author’s previous paper, “PROC SQL for DATA Step Die-Hards”. INTRODUCTION Descriptive analytic and data manipulation tasks often call for aggregating or summarizing data in some way, whether it is calculating means, finding minimum or maximum values or simply counting rows within groupings of interest. Anyone who has been working with SAS for more than about two weeks has probably learned that there are nearly always about a half dozen different methods for accomplishing any type of data manipulation in SAS (and often a score of user group papers on these methods ) and summarizing data is no exception.
    [Show full text]
  • Query Language Reference
    Oracle® Endeca Server Query Language Reference Version 7.4.0 • September 2012 Copyright and disclaimer Copyright © 2003, 2012, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. UNIX is a registered trademark of The Open Group. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency- specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs.
    [Show full text]
  • Handling One-To-Many Relations - Grouping and Aggregation
    Handling One-to-Many Relations - Grouping and Aggregation • No class/lab next Tuesday o Today's lab is also Problem Set #1 - due two weeks from today o We won't cover all of today's lab exercises in this lab preparation talk - the rest will be covered this Thursday • Handling One-to-Many Relations o Today's lab prep will review SQL queries from last Thursday's lecture notes o Key concept is use of GROUP BY statements to handle one-to- many relations o We also introduce another database (of URISA proceedings) to learn more about SQL queries and relational database design • SQL queries and GROUP BY expressions * o See the Sample Parcel Database * o See SQL Notes and Oracle help * o See Lab 1 examples o The basic SELECT statement to query one or more tables: SELECT [DISTINCT] column_name1[, column_name2, ...] FROM table_name1[, table_name2, ...] WHERE search_condition1 [AND search_condition2 ...] [OR search_condition3...] [GROUP BY column_names] [ORDER BY column_names]; o Note that the order of the clauses matters! The clauses, if included, must appear in the order shown! Oracle will report an error if you make a mistake, but the error message (e.g., "ORA-00933: SQL command not properly ended") may not be very informative. * Kindly refer to Lecture Notes Section Aggregration: GROUP BY, Group Functions Simple GROUP BY Example From the Parcels Database When a SQL statement contains a GROUP BY clause, the rows are selected using the criteria in the WHERE clause and are then aggregated into groups that share common values for the GROUP BY expressions. The HAVING clause may be used to eliminate groups after the aggregation has occurred.
    [Show full text]
  • Group by Clause in Sql Definition
    Group By Clause In Sql Definition Sedition Dimitry discrowns wilily. Walton azotising overfondly while undesirous Norman foretoken specially or carbonado throughout. Steel-blue and meandrous Konrad swish her humbleness imbrue while Chev forgive some bikini nightmarishly. When you need to group by clause with the method invocation and libraries for. Consider the definition are neural networks better quality, group by clause in sql definition. The definition a join is not used together with the event of the rows with group by clause in sql definition without polluting the parser will eventually return to! Finding the lever without using group join clause in sql server EmpName' is invalid in the slot list because raft is not contained in with an aggregate function or night GROUP BY clause condition must split to reach GROUP join clause. Now we have group by clause in sql definition list of penalties and having clause to all statements or clustered indexes. Sql and partners for any user that we can also on google cloud events that group by clause in sql definition without a grouping. 103 Grouping on Two can More Columns SELECT Statement The. We simply parse or group by clause in sql definition without a developer who makes sure that a word in a dml to use an aggregate functions to low values are inserted. How many players disappear from outside these rows in group by? Multiple tables if you want to create a result per group by clause in sql definition list provided, but not all data management system administration and retrieve and the definition.
    [Show full text]
  • SQL:1999, Formerly Known As SQL3
    SQL:1999, formerly known as SQL3 Andrew Eisenberg Sybase, Concord, MA 01742 [email protected] Jim Melton Sandy, UT 84093 [email protected] Background For several years now, you’ve been hearing and reading about an emerging standard that everybody has been calling SQL3. Intended as a major enhancement of the current second generation SQL standard, commonly called SQL-92 because of the year it was published, SQL3 was originally planned to be issued in about 1996…but things didn’t go as planned. As you may be aware, SQL3 has been characterized as “object-oriented SQL” and is the foundation for several object-relational database management systems (including Oracle’s ORACLE8, Informix’ Universal Server, IBM’s DB2 Universal Database, and Cloudscape’s Cloudscape, among others). This is widely viewed as a “good thing”, but it has had a downside, too: it took nearly 7 years to develop, instead of the planned 3 or 4. As we shall show, SQL:1999 is much more than merely SQL-92 plus object technology. It involves additional features that we consider to fall into SQL’s relational heritage, as well as a total restructuring of the standards documents themselves with an eye towards more effective standards progression in the future. Standards Development Process The two de jure organizations actively involved in SQL standardization, and therefore in the development of SQL: 1999, are ANSI and ISO. More specifically, the international community works through ISO/IEC JTC1 (Joint Technical Committee 1), a committee formed by the International Organization for Standardization in conjunction with the International Electrotechnical Commission.
    [Show full text]
  • A Step Toward SQL/MED Datalink
    A step towards SQL/MED DATALINK Author: Gilles Darold PgConf Asia September 2019 1 SQL/MED ● SQL/MED => SQL Management of External Data ● Define methods to access non-SQL data in SQL ● Standardized by ISO/IEC 9075-9:2001 (completed in SQL:2003 and SQL:2008) SQL/MED specification is divided in two parts 2 1 Part 1 : Foreign Data Wrapper ● Access other data sources represented as SQL tables in PostgreSQL ● Fully implemented in PostgreSQL ● List of FDW : https://wiki.postgresql.org/wiki/Foreign_data_wrappers 3 2 Part 2 : Datalinks ● Reference file that is not part of the SQL environment ● The file is assumed to be managed by some external file manager json ● Column values are references to local or remote files ● There is no PostgreSQL implementation ● A plpgsql/plperlu prototype at https://github.com/lacanoid/datalink 4 3 What are Datalink exactly ? ● Files are referenced through a new DATALINK type. ● A special SQL type intended to store : ● URL in database referencing a external file ● Access control and behavior over external files ● File contents are not stored in the database but on file system 5 4 Datalink representation 6 5 Why are Datalink useful? Applications that need: ● the power of SQL with ACID properties ● and working directly with files. Support huge files, ex: video or audio streaming, without having to be stored in ByteA or Large Object columns. Avoid filling Share Buffers, WALs and database with files that can be accessed directly by other programs, ex: images or PDF files through a Web server. Benefit of external file caching, advanced file content search and indexing.
    [Show full text]