SQL Review ORDER by ORDER by Example

Total Page:16

File Type:pdf, Size:1020Kb

SQL Review ORDER by ORDER by Example Review SELECT [DISTINCT]… Step 5. π SQL FROM … Step 1. × σ CPS 216 WHERE … Step 2. Advanced Database Systems GROUP BY … Step 3. Grouping HAVING …; Step 4. Another σ 2 ORDER BY ORDER BY example • SELECT [DISTINCT] E1, E2, E3... • List all students, sort them by GPA (descending) FROM…WHERE…GROUP BY…HAVING… and then name (ascending) ORDER BY E [ASC | DESC], – SELECT SID, name, age, GPA i1 E [ASC | DESC], …; FROM Student i2 • ASC = ascending, DESC = descending ORDER BY GPA DESC, name; – ASC is the default option • Operational semantics – Technically, only output columns can appear in – After SELECT list has been computed and optional ORDER BY clause (some DBMS support more) duplicate elimination has been carried out, – Can use output index instead sort the output according to ORDER BY specification ORDER BY 4 DESC, 2; 3 4 Data modification: INSERT Data modification: DELETE • Insert one row • Delete everything – DELETE FROM Enroll; Example: Student 456 takes CPS 216 • Delete according to a WHERE condition – INSERT INTO Enroll VALUES (456, ’CPS 216’); Example: Student 456 drops CPS 216 • Insert the result of a query – DELETE FROM Enroll Example: Force everybody to take CPS 216 WHERE SID = 456 AND CID = ’CPS 216’; Example: Drop students with GPA lower than 1.0 from – INSERT INTO Enroll all CPS classes (SELECT SID, ’CPS 216’ FROM Student – DELETE FROM Enroll WHERE SID NOT IN (SELECT SID FROM Enroll WHERE SID IN (SELECT SID FROM Student WHERE CID = ’CPS 216’)); WHERE GPA < 1.0) AND CID LIKE ’CPS%’; 5 6 1 Data modification: UPDATE Views • Example: Student 142 changes name to “Barney” • A view is like a virtual table – UPDATE Student – Defined by a query, which describes how to compute SET name = ’Barney’ the view contents on the fly WHERE SID = 142; – DBMS stores the view definition query instead of • Example: Let’s be “fair”? view contents – UPDATE Student – Can be used in queries just like a regular table SET GPA = (SELECT AVG(GPA) FROM Student); – Update of every row causes average GPA to change – Average GPA is computed over the old Student table 7 8 Creating and dropping views Using views in queries • Example: CPS 216 roster • Example: find the average GPA of CPS 216 – CREATE VIEW CPS216Roster AS students SELECT SID, name, age, GPA – SELECT AVG(GPA) FROM CPS216Roster; FROM Student – To process the query, replace the reference to the WHERE SID IN (SELECT SID FROM Enroll view by its definition WHERE CID = ’CPS 216’); – SELECT AVG(GPA) • To drop a view (or table) FROM (SELECT SID, name, age, GPA FROM Student –DROP VIEW view_name; WHERE SID IN (SELECT SID – DROP TABLE table_name; FROM Enroll WHERE CID = ’CPS 216’)); 9 10 Why use views? Modifying views • To hide data from users • Doesn’t seems to make sense since views are • To hide complexity from users virtual • Logical data independence • But does make sense if that’s how users view the – If applications deal with views, we can change the database underlying schema without affecting applications • Goal: modify the base tables such that the – Recall physical data independence: change the modification would appear to have been physical organization of data without affecting accomplished on the view applications • Real database applications use tons of views 11 12 2 A simple case An impossible case CREATE VIEW StudentGPA AS CREATE VIEW HighGPAStudent AS SELECT SID, GPA FROM Student; SELECT SID, GPA FROM Student WHERE GPA > 3.7; DELETE FROM StudentGPA WHERE SID = 123; INSERT INTO HighGPAStudent translates to: VALUES(987, 2.5); • No matter what you do on the student table, DELETE FROM Student WHERE SID = 123; the inserted tuple won’t be in HighGPAStudent 13 14 A case with too many possibilities SQL92 updatable views • Single-table SFW CREATE VIEW AverageGPA(GPA) AS – No aggregation SELECT AVG(GPA) FROM Student; – No subqueries – Note that you can rename columns in view definition UPDATE AverageGPA SET GPA = 2.5; • Overly restrictive • Set everybody’s GPA to 2.5? • Still gets it wrong in some cases • Adjust everybody’s GPA by the same amount? – See the slide titled “An impossible case” • Just lower Bart’s GPA? 15 16 Incomplete information Solution 1 • Example: Student (SID, name, age, GPA) • A dedicated special value for each domain – GPA cannot be –1, so use –1 as a special value • Value unknown – SELECT AVG(GPA) FROM Student; • Oh no, it’s lower than I expected! – We don’t know Nelson’s age – SELECT AVG(GPA) FROM Student • Value not applicable WHERE GPA <> –1; – Nelson hasn’t taken any classes yet; what’s his GPA? • Complicates applications – Remember the pre-Y2K bug? • 09/09/99 was used as an invalid or missing date value • It’s tricky to make these assumptions! 17 18 3 Solution 2 SQL’s solution • A valid-bit column for every real column – Student (SID, name, name_is_valid, • A special value NULL age, age_is_valid, – Same for every domain GPA, GPA_is_valid) – Special rules for dealing with NULLs – Too much overhead – SELECT AVG(GPA) FROM Student WHERE GPA_valid; • Example: Student (SID, name, age, GPA) • Still complicates applications – <789, ’Nelson’, NULL, NULL> 19 20 Computing with NULLs Three-valued logic • TRUE = 1, FALSE = 0, UNKNOWN = 0.5 • When we operate on a NULL and another value • x AND y = min(x, y) (including another NULL) using +, –, etc., the x OR y = max(x, y) result is NULL NOT(x) = 1 – x • When we compare a NULL with another value (including another NULL) using =, >, etc., the • Aggregate functions ignore NULL, except result is UNKNOWN COUNT(*) • WHERE and HAVING clauses only select tuples if the condition evaluates to TRUE – UNKNOWN is insufficient 21 22 Unfortunate consequences Another problem • select avg(GPA) from Student; • Example: Who has NULL GPA values? select sum(GPA) / count(*) from Student; – select * from Student where GPA = NULL; – Not equivalent • Won’t work; never returns anything! – avg(GPA) = sum(GPA) / count(GPA) still holds – (select * from Student) except all • select * from Student; (select * from Student where GPA = 0 OR GPA<>0); •Ugly! select * from Student – New built-in predicates IS NULL and IS NOT NULL where GPA > 3.0 or GPA <= 3.0; select * from Student where GPA is null; – Not equivalent • Be careful: NULL breaks many equivalences 23 24 4 Recap Constraints • Covered • Restrictions on allowable data in a database –ORDER BY – In addition to the simple structure and type – Data modification statements restrictions imposed by the table definitions –Views – Declared as part of the schema – NULLs – Enforced by the DBMS • Skipped – Outerjoin • Why use constraints? – Alternative join syntax – Protect data integrity (catch errors) – Schema modification statements – Tell the DBMS about the data (so it can optimize •Next better) – Constraints 25 26 Types of constraints NOT NULL constraint example • create table Student • NOT NULL (SID integer not null, name varchar(30) not null, •Key email varchar(30), • Referential integrity age integer, GPA float); • General assertion • create table Course (CID char(10) not null, • Tuple- and attribute-based CHECKs title varchar(100) not null); • create table Enroll (SID integer not null, CID char(10) not null); 27 28 Key declaration Key declaration examples • At most one PRIMARY KEY per table • create table Student – Typically implies a primary index (SID integer not null primary key, name varchar(30) not null, Works on Oracle – Rows are stored inside the index, typically sorted by email varchar(30) unique, but not DB2: primary key value DB2 requires UNIQUE age integer, GPA float); key columns • Any number of UNIQUE keys per table • create table Course to be NOT NULL – Typically implies a secondary index (CID char(10) not null primary key, – Pointers to rows are stored inside the index title varchar(100) not null); • create table Enroll (SID integer not null, CID char(10) not null, primary key(SID, CID)); 29 30 5 Referential integrity example Referential integrity in SQL – Enroll.SID references Student.SID • Referenced column must be PRIMARY KEY – Enroll.CID references Course.CID • Referencing column is called FOREIGN KEY – If an SID appears in Enroll, it must appear in Student • Example declaration – If a CID appears in Enroll, it must appear in Course – create table Enroll – That is, no “dangling pointers” (SID integer not null references Student(SID), Student Enroll Course CID char(10) not null, SID name … SID CID CID title primary key(SID, CID), 142 Bart … 142 CPS 216 CPS 216 Advanced Data… foreign key CID references Course(CID)); 123 Milhouse … 142 CPS 214 CPS 130 Analysis of Algo… 857 Lisa … 123 CPS 216 CPS 214 Computer Net… 456 Ralph … 857 CPS 216 ... ... ... ... ... 857 CPS 130 ... ... 31 32 Enforcing referential integrity Deferred constraint checking Example: Enroll.SID references Student.SID • No-chicken-no-egg problem • Insert or update a Enroll tuple so it refers to a – create table Dept non-existent SID (name char(20) not null primary key, chair char(30) not null references Prof(name)); – Reject create table Prof • Delete or update a Student tuple whose SID is (name char(30) not null primary key, referenced by some Enroll tuple dept char(20) not null references Dept(name)); – Reject – The first INSERT will always violate a constraint – Cascade: ripple changes to all referring tuples • Deferred constraint checking is necessary – Set NULL: set all references to NULL – Check only at the end of a transaction – All three options can be specified in SQL – Allowed in SQL as an option 33 34 General assertion Tuple- and attribute-based CHECKs • CREATE ASSERTION assertion_name • Associated with a single table CHECK assertion_condition; • Only checked when a tuple or an attribute is • assertion_condition is checked for each inserted or updated modification that could potentially violate it • Example: • Example: Enroll.SID references Student.SID – CREATE TABLE Enroll – CREATE ASSERTION EnrollStudentRefIntegrity (SID integer not null CHECK (NOT EXISTS (SELECT * FROM Enroll CHECK (SID IN (SELECT SID FROM Student)), WHERE SID NOT IN CID …); (SELECT SID FROM Student))); – Is it a referential integrity constraint? • SQL3, but not all (perhaps no) DBMS supports it – Not quite; not checked when Student is modified 35 36 6 Next time Transactions! 37 7.
Recommended publications
  • Select Items from Proc.Sql Where Items>Basics
    Advanced Tutorials SELECT ITEMS FROM PROC.SQL WHERE ITEMS> BASICS Alan Dickson & Ray Pass ASG. Inc. to get more SAS sort-space. Also, it may be possible to limit For those of you who have become extremely the total size of the resulting extract set by doing as much of comfortable and competent in the DATA step, getting into the joining/merging as close to the raw data as possible. PROC SQL in a big wrJ may not initially seem worth it Your first exposure may have been through a training class. a 3) Communications. SQL is becoming a lingua Tutorial at a conference, or a real-world "had-to" situation, franca of data processing, and this trend is likely to continue such as needing to read corporate data stored in DB2, or for some time. You may find it easier to deal with your another relational database. So, perhaps it is just another tool systems department, vendors etc., who may not know SAS at in your PROC-kit for working with SAS datasets, or maybe all, ifyou can explain yourself in SQL tenus. it's your primary interface with another environment 4) Career growth. Not unrelated to the above - Whatever the case, your first experiences usually there's a demand for people who can really understand SQL. involve only the basic CREATE TABLE, SELECT, FROM, Its apparent simplicity is deceiving. On the one hand, you can WHERE, GROUP BY, ORDER BY options. Thus, the do an incredible amount with very little code - on the other, tendency is frequently to treat it as no more than a data extract you can readily generate plausible, but inaccurate, results with tool.
    [Show full text]
  • Database Foundations 6-8 Sorting Data Using ORDER BY
    Database Foundations 6-8 Sorting Data Using ORDER BY Copyright © 2015, Oracle and/or its affiliates. All rights reserved. Roadmap Data Transaction Introduction to Structured Data Definition Manipulation Control Oracle Query Language Language Language (TCL) Application Language (DDL) (DML) Express (SQL) Restricting Sorting Data Joining Tables Retrieving Data Using Using ORDER Using JOINS Data Using WHERE BY SELECT You are here DFo 6-8 Copyright © 2015, Oracle and/or its affiliates. All rights reserved. 3 Sorting Data Using ORDER BY Objectives This lesson covers the following objectives: • Use the ORDER BY clause to sort SQL query results • Identify the correct placement of the ORDER BY clause within a SELECT statement • Order data and limit row output by using the SQL row_limiting_clause • Use substitution variables in the ORDER BY clause DFo 6-8 Copyright © 2015, Oracle and/or its affiliates. All rights reserved. 4 Sorting Data Using ORDER BY Using the ORDER BY Clause • Sort the retrieved rows with the ORDER BY clause: – ASC: Ascending order (default) – DESC: Descending order • The ORDER BY clause comes last in the SELECT statement: SELECT last_name, job_id, department_id, hire_date FROM employees ORDER BY hire_date ; DFo 6-8 Copyright © 2015, Oracle and/or its affiliates. All rights reserved. 5 Sorting Data Using ORDER BY Sorting • Sorting in descending order: SELECT last_name, job_id, department_id, hire_date FROM employees 1 ORDER BY hire_date DESC ; • Sorting by column alias: SELECT employee_id, last_name, salary*12 annsal 2 FROM employees ORDER BY annsal ; DFo 6-8 Copyright © 2015, Oracle and/or its affiliates. All rights reserved. 6 Sorting Data Using ORDER BY Sorting • Sorting by using the column's numeric position: SELECT last_name, job_id, department_id, hire_date FROM employees 3 ORDER BY 3; • Sorting by multiple columns: SELECT last_name, department_id, salary FROM employees 4 ORDER BY department_id, salary DESC; DFo 6-8 Copyright © 2015, Oracle and/or its affiliates.
    [Show full text]
  • How Mysql Handles ORDER BY, GROUP BY, and DISTINCT
    How MySQL handles ORDER BY, GROUP BY, and DISTINCT Sergey Petrunia, [email protected] MySQL University Session November 1, 2007 Copyright 2007 MySQL AB The World’s Most Popular Open Source Database 1 Handling ORDER BY • Available means to produce ordered streams: – Use an ordered index • range access – not with MyISAM/InnoDB's DS-MRR – not with Falcon – Has extra (invisible) cost with NDB • ref access (but not ref-or-null) results of ref(t.keypart1=const) are ordered by t.keypart2, t.keypart3, ... • index access – Use filesort Copyright 2007 MySQL AB The World’s Most Popular Open Source Database 2 Executing join and producing ordered stream There are three ways to produce ordered join output Method EXPLAIN shows Use an ordered index Nothing particular Use filesort() on 1st non-constant table “Using filesort” in the first row Put join result into a temporary table “Using temporary; Using filesort” in the and use filesort() on it first row EXPLAIN is a bit counterintuitive: id select_type table type possible_keys key key_len ref rows Extra Using where; 1 SIMPLE t2 range a a 5 NULL 10 Using temporary; Using filesort 1 SIMPLE t2a ref a a 5 t2.b 1 Using where Copyright 2007 MySQL AB The World’s Most Popular Open Source Database 3 Using index to produce ordered join result • ORDER BY must use columns from one index • DESC is ok if it is present for all columns • Equality propagation: – “a=b AND b=const” is detected – “WHERE x=t.key ORDER BY x” is not • Cannot use join buffering – Use of matching join order disables use of join buffering.
    [Show full text]
  • Database SQL Call Level Interface 7.1
    IBM IBM i Database SQL call level interface 7.1 IBM IBM i Database SQL call level interface 7.1 Note Before using this information and the product it supports, read the information in “Notices,” on page 321. This edition applies to IBM i 7.1 (product number 5770-SS1) and to all subsequent releases and modifications until otherwise indicated in new editions. This version does not run on all reduced instruction set computer (RISC) models nor does it run on CISC models. © Copyright IBM Corporation 1999, 2010. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents SQL call level interface ........ 1 SQLExecute - Execute a statement ..... 103 What's new for IBM i 7.1 .......... 1 SQLExtendedFetch - Fetch array of rows ... 105 PDF file for SQL call level interface ....... 1 SQLFetch - Fetch next row ........ 107 Getting started with DB2 for i CLI ....... 2 SQLFetchScroll - Fetch from a scrollable cursor 113 | Differences between DB2 for i CLI and embedded SQLForeignKeys - Get the list of foreign key | SQL ................ 2 columns .............. 115 Advantages of using DB2 for i CLI instead of SQLFreeConnect - Free connection handle ... 120 embedded SQL ............ 5 SQLFreeEnv - Free environment handle ... 121 Deciding between DB2 for i CLI, dynamic SQL, SQLFreeHandle - Free a handle ...... 122 and static SQL ............. 6 SQLFreeStmt - Free (or reset) a statement handle 123 Writing a DB2 for i CLI application ....... 6 SQLGetCol - Retrieve one column of a row of Initialization and termination tasks in a DB2 for i the result set ............ 125 CLI application ............ 7 SQLGetConnectAttr - Get the value of a Transaction processing task in a DB2 for i CLI connection attribute .........
    [Show full text]
  • Firebird 3 Windowing Functions
    Firebird 3 Windowing Functions Firebird 3 Windowing Functions Author: Philippe Makowski IBPhoenix Email: pmakowski@ibphoenix Licence: Public Documentation License Date: 2011-11-22 Philippe Makowski - IBPhoenix - 2011-11-22 Firebird 3 Windowing Functions What are Windowing Functions? • Similar to classical aggregates but does more! • Provides access to set of rows from the current row • Introduced SQL:2003 and more detail in SQL:2008 • Supported by PostgreSQL, Oracle, SQL Server, Sybase and DB2 • Used in OLAP mainly but also useful in OLTP • Analysis and reporting by rankings, cumulative aggregates Philippe Makowski - IBPhoenix - 2011-11-22 Firebird 3 Windowing Functions Windowed Table Functions • Windowed table function • operates on a window of a table • returns a value for every row in that window • the value is calculated by taking into consideration values from the set of rows in that window • 8 new windowed table functions • In addition, old aggregate functions can also be used as windowed table functions • Allows calculation of moving and cumulative aggregate values. Philippe Makowski - IBPhoenix - 2011-11-22 Firebird 3 Windowing Functions A Window • Represents set of rows that is used to compute additionnal attributes • Based on three main concepts • partition • specified by PARTITION BY clause in OVER() • Allows to subdivide the table, much like GROUP BY clause • Without a PARTITION BY clause, the whole table is in a single partition • order • defines an order with a partition • may contain multiple order items • Each item includes
    [Show full text]
  • Aggregate Order by Clause
    Aggregate Order By Clause Dialectal Bud elucidated Tuesdays. Nealy vulgarizes his jockos resell unplausibly or instantly after Clarke hurrah and court-martial stalwartly, stanchable and jellied. Invertebrate and cannabic Benji often minstrels some relator some or reactivates needfully. The default order is ascending. Have exactly match this assigned stream aggregate functions, not work around with security software development platform on a calculation. We use cookies to ensure that we give you the best experience on our website. Result output occurs within the minimum time interval of timer resolution. It is not counted using a table as i have group as a query is faster count of getting your browser. Let us explore it further in the next section. If red is enabled, when business volume where data the sort reaches the specified number of bytes, the collected data is sorted and dumped into these temporary file. Kris has written hundreds of blog articles and many online courses. Divides the result set clock the complain of groups specified as an argument to the function. Threat and leaves only return data by order specified, a human agents. However, this method may not scale useful in situations where thousands of concurrent transactions are initiating updates to derive same data table. Did you need not performed using? If blue could step me first what is vexing you, anyone can try to explain it part. It returns all employees to database, and produces no statements require complex string manipulation and. Solve all tasks to sort to happen next lesson. Execute every following query access GROUP BY union to calculate these values.
    [Show full text]
  • Hive Where Clause Example
    Hive Where Clause Example Bell-bottomed Christie engorged that mantids reattributes inaccessibly and recrystallize vociferously. Plethoric and seamier Addie still doth his argents insultingly. Rubescent Antin jibbed, his somnolency razzes repackages insupportably. Pruning occurs directly with where and limit clause to store data question and column must match. The ideal for column, sum of elements, the other than hive commands are hive example the terminator for nonpartitioned external source table? Cli to hive where clause used to pick tables and having a column qualifier. For use of hive sql, so the source table csv_table in most robust results yourself and populated using. If the bucket of the sampling is created in this command. We want to talk about which it? Sql statements are not every data, we should run in big data structures the. Try substituting synonyms for full name. Currently the where at query, urban private key value then prints the where hive table? Hive would like the partitioning is why the. In hive example hive data types. For the data, depending on hive will also be present in applying the example hive also widely used. The electrician are very similar to avoid reading this way, then one virtual column values in data aggregation functionalities provided below is sent to be expressed. Spark column values. In where clause will print the example when you a script for example, it will be spelled out of subquery must produce such hash bucket level of. After copy and collect_list instead of the same type is only needs to calculate approximately percentiles for sorting phase in keyword is expected schema.
    [Show full text]
  • SQL from Wikipedia, the Free Encyclopedia Jump To: Navigation
    SQL From Wikipedia, the free encyclopedia Jump to: navigation, search This article is about the database language. For the airport with IATA code SQL, see San Carlos Airport. SQL Paradigm Multi-paradigm Appeared in 1974 Designed by Donald D. Chamberlin Raymond F. Boyce Developer IBM Stable release SQL:2008 (2008) Typing discipline Static, strong Major implementations Many Dialects SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2008 Influenced by Datalog Influenced Agena, CQL, LINQ, Windows PowerShell OS Cross-platform SQL (officially pronounced /ˌɛskjuːˈɛl/ like "S-Q-L" but is often pronounced / ˈsiːkwəl/ like "Sequel"),[1] often referred to as Structured Query Language,[2] [3] is a database computer language designed for managing data in relational database management systems (RDBMS), and originally based upon relational algebra. Its scope includes data insert, query, update and delete, schema creation and modification, and data access control. SQL was one of the first languages for Edgar F. Codd's relational model in his influential 1970 paper, "A Relational Model of Data for Large Shared Data Banks"[4] and became the most widely used language for relational databases.[2][5] Contents [hide] * 1 History * 2 Language elements o 2.1 Queries + 2.1.1 Null and three-valued logic (3VL) o 2.2 Data manipulation o 2.3 Transaction controls o 2.4 Data definition o 2.5 Data types + 2.5.1 Character strings + 2.5.2 Bit strings + 2.5.3 Numbers + 2.5.4 Date and time o 2.6 Data control o 2.7 Procedural extensions * 3 Criticisms of SQL o 3.1 Cross-vendor portability * 4 Standardization o 4.1 Standard structure * 5 Alternatives to SQL * 6 See also * 7 References * 8 External links [edit] History SQL was developed at IBM by Donald D.
    [Show full text]
  • Relational Calculus Introduction • Contrived by Edgar F
    Starting with SQL Server and Azure SQL Database Module 5: Introducing the relational model Lessons • Introduction to the relational model • Domains • Relational operators • Relational algebra • Relational calculus Introduction • Contrived by Edgar F. Codd, IBM, 1969 • Background independent • A simple, yet rigorously defined concept of how users perceive data – The relational model represents data in the form of two- dimension tables, i.e. relations – Each table represents some real-world entity (person, place, thing, or event) about which information is collected – Information principle: all information is stored in relations A Logical Point of View • A relational database is a collection of tables – The organization of data into relational tables is known as the logical view of the database – The way the database software physically stores the data on a computer disk system is called the internal view – The internal view differs from product to product – Interchangeability principle: a physical relation can always be replaced by a virtual one (e.g., a view can replace a table) Tuples • A tuple is a set of ordered triples of attributes – A triple consists of attribute name, attribute type and attribute value – Every attribute of a tuple contains exactly one value of appropriate type – The order of the attributes is insignificant • Every attribute must have unique name – A subset of attributes of tuple is a tuple Relations • A relation is a variable consisting of a set of tuples – The order of the attributes is insignificant – Every attribute
    [Show full text]
  • SQL to Hive Cheat Sheet
    We Do Hadoop Contents Cheat Sheet 1 Additional Resources 2 Query, Metadata Hive for SQL Users 3 Current SQL Compatibility, Command Line, Hive Shell If you’re already a SQL user then working with Hadoop may be a little easier than you think, thanks to Apache Hive. Apache Hive is data warehouse infrastructure built on top of Apache™ Hadoop® for providing data summarization, ad hoc query, and analysis of large datasets. It provides a mechanism to project structure onto the data in Hadoop and to query that data using a SQL-like language called HiveQL (HQL). Use this handy cheat sheet (based on this original MySQL cheat sheet) to get going with Hive and Hadoop. Additional Resources Learn to become fluent in Apache Hive with the Hive Language Manual: https://cwiki.apache.org/confluence/display/Hive/LanguageManual Get in the Hortonworks Sandbox and try out Hadoop with interactive tutorials: http://hortonworks.com/sandbox Register today for Apache Hadoop Training and Certification at Hortonworks University: http://hortonworks.com/training Twitter: twitter.com/hortonworks Facebook: facebook.com/hortonworks International: 1.408.916.4121 www.hortonworks.com We Do Hadoop Query Function MySQL HiveQL Retrieving information SELECT from_columns FROM table WHERE conditions; SELECT from_columns FROM table WHERE conditions; All values SELECT * FROM table; SELECT * FROM table; Some values SELECT * FROM table WHERE rec_name = “value”; SELECT * FROM table WHERE rec_name = "value"; Multiple criteria SELECT * FROM table WHERE rec1=”value1” AND SELECT * FROM
    [Show full text]
  • Going OVER and Above with SQL
    Going OVER and Above with SQL Tamar E. Granor Tomorrow's Solutions, LLC Voice: 215-635-1958 Email: [email protected] The SQL 2003 standard introduced the OVER keyword that lets you apply a function to a set of records. Introduced in SQL Server 2005, this capability was extended in SQL Server 2012. The functions allow you to rank records, aggregate them in a variety of ways, put data from multiple records into a single result record, and compute and use percentiles. The set of problems they solve range from removing exact duplicates to computing running totals and moving averages to comparing data from different periods to removing outliers. In this session, we'll look at the OVER operator and the many functions you can use with it. We'll look at a variety of problems that can be solved using OVER. Going OVER and Above with SQL Over the last couple of years, I’ve been exploring aspects of SQL Server’s T-SQL implementation that aren’t included in VFP’s SQL sub-language. I first noticed the OVER keyword as an easy way to solve a problem that’s fairly complex with VFP’s SQL, getting the top N records in each of a set of groups with a single query. At the time, I noticed that OVER had other uses, but I didn’t stop to explore them. When I finally returned to see what else OVER could do, I was blown away. In recent versions of SQL Server (2012 and later), OVER provides ways to compute running totals and moving averages, to put data from several records of the same table into a single result record, to divide records into percentile groups and more.
    [Show full text]
  • Efficient Processing of Window Functions in Analytical SQL Queries
    Efficient Processing of Window Functions in Analytical SQL Queries Viktor Leis Kan Kundhikanjana Technische Universitat¨ Munchen¨ Technische Universitat¨ Munchen¨ [email protected] [email protected] Alfons Kemper Thomas Neumann Technische Universitat¨ Munchen¨ Technische Universitat¨ Munchen¨ [email protected] [email protected] ABSTRACT select location, time, value, abs(value- (avg(value) over w))/(stddev(value) over w) Window functions, also known as analytic OLAP functions, have from measurement been part of the SQL standard for more than a decade and are now a window w as ( widely-used feature. Window functions allow to elegantly express partition by location many useful query types including time series analysis, ranking, order by time percentiles, moving averages, and cumulative sums. Formulating range between 5 preceding and 5 following) such queries in plain SQL-92 is usually both cumbersome and in- efficient. The query normalizes each measurement by subtracting the aver- Despite being supported by all major database systems, there age and dividing by the standard deviation. Both aggregates are have been few publications that describe how to implement an effi- computed over a window of 5 time units around the time of the cient relational window operator. This work aims at filling this gap measurement and at the same location. Without window functions, by presenting an efficient and general algorithm for the window it is possible to state the query as follows: operator. Our algorithm is optimized for high-performance main- memory database systems and has excellent performance on mod- select location, time, value, abs(value- ern multi-core CPUs. We show how to fully parallelize all phases (select avg(value) of the operator in order to effectively scale for arbitrary input dis- from measurement m2 tributions.
    [Show full text]