SQLite and sqlite3

Statistical Computing & Programming
Shawn Santo
06-12-20

Supplementary materials

Companion videos
  • Review and sqlite3 overview
  • Creating tables
  • Join operations
  • Finer details

Additional resources
  • SQL Tutorial
  • Package nodbi vignette

Recall

Databases

A database is a collection of data typically stored in a computer system. It is controlled by a database management system (DBMS). There may be applications associated with them, such as an API.

Types of DBMS: MySQL, Microsoft Access, Microsoft SQL Server, FileMaker Pro, Oracle Database, and dBASE.

Types of databases: Relational, object-oriented, distributed, NoSQL, graph, and more.

Relational database management system

A system that governs a relational database, where data is identified and accessed in relation to other data in the database.

Relational databases generally organize data into tables comprised of fields and records.

Many RDBMS use SQL to access data.

SQL

SQL stands for Structured Query Language.

It is an American National Standards Institute standard computer language for accessing and manipulating RDBMS.

There are different versions of SQL, but to be compliant with the American National Standards Institute the version must support the key query verbs.

Big picture

Source: https://www.w3resource.com/sql/tutorials.php

Common SQL query structure

Main verbs to get data:

    SELECT columns or computations
    FROM table
    WHERE condition
    GROUP BY columns
    HAVING condition
    ORDER BY column [ASC | DESC]
    LIMIT offset, count

WHERE, GROUP BY, HAVING, ORDER BY, LIMIT are all optional. Primary computations: MIN, MAX, COUNT, SUM, AVG.

We can perform these queries with dbGetQuery() and paste().
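
To make the clause order concrete, here is a minimal sketch that runs one query containing every clause from the template above. It uses Python's built-in sqlite3 module as a stand-in for dbGetQuery(); the purchases table, its rows, and the yearly_means() helper are hypothetical and are not part of the oil data used later.

```python
import sqlite3

# Hypothetical toy data (year, month, barrels); not taken from the oil data set.
ROWS = [
    (2000, 1, 10.0), (2000, 2, 30.0),
    (2001, 1, 50.0), (2001, 2, 70.0),
    (2002, 1, 5.0), (2002, 2, 15.0),
]


def yearly_means(min_barrels, min_mean, limit):
    """Return (year, mean barrels) pairs using every clause of the template."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE purchases (year INTEGER, month INTEGER, barrels REAL)")
    con.executemany("INSERT INTO purchases VALUES (?, ?, ?)", ROWS)
    query = """
        SELECT year, AVG(barrels) AS mean_barrels  -- columns or computations
        FROM purchases
        WHERE barrels >= ?                         -- filter rows before grouping
        GROUP BY year
        HAVING AVG(barrels) > ?                    -- filter groups after aggregation
        ORDER BY mean_barrels DESC
        LIMIT ?
    """
    result = con.execute(query, (min_barrels, min_mean, limit)).fetchall()
    con.close()
    return result


print(yearly_means(min_barrels=10.0, min_mean=15.0, limit=2))
# [(2001, 60.0), (2000, 20.0)]
```

WHERE drops the 5.0 row before any averaging happens, while HAVING drops the year 2002 only after its mean (15.0) has been computed. The values are passed as query parameters rather than pasted into the SQL string, which sidesteps quoting problems.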

Verb connections

    SQL       dplyr
    SELECT    select()
    table     data frame
    WHERE     filter() pre-aggregation/calculation
    GROUP BY  group_by()
    HAVING    filter() post-aggregation/calculation
    ORDER BY  arrange() with possibly a desc()
    LIMIT     slice()

SQL arithmetic and comparison operators

SQL supports the standard +, -, *, /, and % (modulo) arithmetic operators and the following comparison operators.

    Operator  Description
    =         Equal to
    >         Greater than
    <         Less than
    >=        Greater than or equal to
    <=        Less than or equal to
    <>        Not equal to

SQL logical operators

    Operator  Description
    ALL       TRUE if all of the subquery values meet the condition
    AND       TRUE if all the conditions separated by AND are TRUE
    ANY       TRUE if any of the subquery values meet the condition
    BETWEEN   TRUE if the operand is within the range of comparisons
    EXISTS    TRUE if the subquery returns one or more records
    IN        TRUE if the operand is equal to one of a list of expressions
    LIKE      TRUE if the operand matches a pattern
    NOT       Displays a record if the condition(s) is NOT TRUE
    OR        TRUE if any of the conditions separated by OR is TRUE
    SOME      TRUE if any of the subquery values meet the condition

sqlite3

SQLite and sqlite3

SQLite is a software library that provides a relational database management system. The lite in SQLite means lightweight in terms of setup, database administration, and required resources.

This is available on the DSS servers. In your terminal:

    [sms185@numeric1 ~]$ which sqlite3
    /usr/bin/sqlite3

Check out man sqlite3. From the summary:

    sqlite3 is a terminal-based front-end to the SQLite library that can evaluate queries interactively and display the results in multiple formats. sqlite3 can also be used within shell scripts and other applications to provide batch processing features.

If you have a Mac, sqlite3 should be available there as well.

Create a database

First, in RStudio, let's create an object called oil.

    oil <- readr::read_table(
      "http://users.stat.ufl.edu/~winner/data/oilimport.dat",
      col_names = c("year", "month", "month_series", "barrels_purchased",
                    "total_value", "unit_price", "cpi")
    )
    oil
    #> # A tibble: 420 x 7
    #>     year month month_series barrels_purchased total_value unit_price   cpi
    #>    <dbl> <dbl>        <dbl>             <dbl>       <dbl>      <dbl> <dbl>
    #>  1  1973     1            1            102750      282153       2.75  42.6
    #>  2  1973     2            2             95276      259675       2.73  42.9
    #>  3  1973     3            3            112053      316283       2.82  43.3
    #>  4  1973     4            4             98841      279958       2.83  43.6
    #>  5  1973     5            5            123102      357042       2.9   43.9
    #>  6  1973     6            6            118152      360020       3.05  44.2
    #>  7  1973     7            7            101752      320591       3.15  44.3
    #>  8  1973     8            8            148219      483085       3.26  45.1
    #>  9  1973     9            9            124966      421856       3.38  45.2
    #> 10  1973    10           10            134741      476763       3.54  45.6
    #> # … with 410 more rows

Next, create a database called mydb.sqlite and add a table oil.

    library(DBI) # R Database Interface
    mydb <- dbConnect(RSQLite::SQLite(), "mydb.sqlite")
    dbWriteTable(mydb, "oil", oil)

Check our database is there.

    [sms185@numeric1 sql]$ ls
    mydb.sqlite

Load it with command sqlite3 mydb.sqlite.

    [sms185@numeric1 sql]$ sqlite3 mydb.sqlite
    SQLite version 3.26.0 2018-12-01 12:34:55
    Enter ".help" for usage hints.
    sqlite>

Typing .help at the prompt (sqlite>)

    sqlite> .help
    .archive ...             Manage SQL archives
    .auth ON|OFF             Show authorizer callbacks
    .backup ?DB? FILE        Backup DB (default "main") to FILE
    .bail on|off             Stop after hitting an error. Default OFF
    .binary on|off           Turn binary output on or off. Default OFF
    .cd DIRECTORY            Change the working directory to DIRECTORY
    .changes on|off          Show number of rows changed by SQL
    .check GLOB              Fail if output since .testcase does not match
    .clone NEWDB             Clone data into NEWDB from the existing database
    ....

will reveal some of the help features.

Commands sqlite3

1. Query commands: sqlite3 just reads lines of input and passes them on to the SQLite library for execution. This will be the typical command you provide when you want to access, update, and merge data tables.

2. Dot commands: these are lines that begin with a dot (".") and are intercepted and interpreted by the sqlite3 program itself. These commands are typically used to change the output format of queries, or to execute certain prepackaged query statements.

Navigating sqlite3

To list all names and files of attached databases

    sqlite> .databases
    main: /home/fac/sms185/sql/mydb.sqlite

To list all the tables

    sqlite> .tables
    oil

View the current settings

    sqlite> .show
            echo: off
             eqp: off
         explain: auto
         headers: off
            mode: list
       nullvalue: ""
          output: stdout
    colseparator: "|"
    rowseparator: "\n"
           stats: off
           width:
        filename: mydb.sqlite

Table details

To show the CREATE statements matching the specified table

    sqlite> .schema oil
    CREATE TABLE `oil` (
      `year` REAL,
      `month` REAL,
      `month_series` REAL,
      `barrels_purchased` REAL,
      `total_value` REAL,
      `unit_price` REAL,
      `cpi` REAL
    );

Note the ; at the end.

Query

Get the first 5 rows from oil.

    sqlite> SELECT * FROM oil
       ...> LIMIT 5;
    1973.0|1.0|1.0|102750.0|282153.0|2.75|42.6
    1973.0|2.0|2.0|95276.0|259675.0|2.73|42.9
    1973.0|3.0|3.0|112053.0|316283.0|2.82|43.3
    1973.0|4.0|4.0|98841.0|279958.0|2.83|43.6
    1973.0|5.0|5.0|123102.0|357042.0|2.9|43.9

How about a nicer output? Change the mode and headers settings.

    sqlite> .mode column
    sqlite> .headers on
    sqlite> SELECT * FROM oil
       ...> LIMIT 5;
    year        month       month_series  barrels_purchased  total_value  unit_price  cpi
    ----------  ----------  ------------  -----------------  -----------  ----------  ----------
    1973.0      1.0         1.0           102750.0           282153.0     2.75        42.6
    1973.0      2.0         2.0           95276.0            259675.0     2.73        42.9
    1973.0      3.0         3.0           112053.0           316283.0     2.82        43.3
    1973.0      4.0         4.0           98841.0            279958.0     2.83        43.6
    1973.0      5.0         5.0           123102.0           357042.0     2.9         43.9

Examples

For each year, find the month that the maximum number of barrels of oil were purchased.

    sqlite> SELECT year, month as MM, MAX(barrels_purchased) as Max_BBLs
       ...> FROM oil
       ...> GROUP BY year
       ...> ORDER BY Max_BBLs DESC
       ...> LIMIT 10;
    year        MM          Max_BBLs
    ----------  ----------  ----------
    2004.0      6.0         344729.0
    2006.0      8.0         336528.0
    2003.0      7.0         335511.0
    2005.0      8.0         329039.0
    2007.0      3.0         324248.0
    2002.0      10.0        315720.0
    2000.0      8.0         312955.0
    2001.0      10.0        311803.0
    1998.0      8.0         300227.0
    1999.0      5.0         289457.0

This query relies on SQLite-specific behavior: when a query has a single MAX() aggregate, a bare column such as month takes its value from the row that holds the maximum. Other database systems may reject the query or return an arbitrary month.

Let's make a similar query but for unit price and only return the relevant columns.

    sqlite> SELECT year, MAX(unit_price) as max_unit_price
       ...> FROM oil
       ...> GROUP BY year
       ...> HAVING year BETWEEN 1975 AND 1990;
    year        max_unit_price
    ----------  --------------
    1975.0      12.11
    1976.0      12.59
    1977.0      13.62
    1978.0      13.59
    1979.0      24.57
    1980.0      33.94
    1981.0      36.92
    1982.0      35.53
    1983.0      32.5
    1984.0      28.05
    1985.0      27.38
    1986.0      25.8
    1987.0      18.07
    1988.0      22.4
    1989.0      17.97
    1990.0      30.08

Exercise

Create a query that returns a table that has all the oil information where the average barrels purchased for a year exceeded 200,000.

Creating new tables from existing tables

Create with command CREATE TABLE

    sqlite> CREATE TABLE oil_1973(
       ...> month INT,
       ...> barrels_purchased REAL,
       ...> total_value REAL,
       ...> unit_price REAL,
       ...> cpi REAL
       ...> );

We are specifying the table name, oil_1973, the variable names, and their types.
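
The declared types matter because SQLite uses them to decide how values are stored. The sketch below, again using Python's sqlite3 module as an assumed stand-in for the sqlite3 shell, creates the same oil_1973 layout, inserts one row taken from the slides' output, and then reads back both the declared and the stored types. The helper name declared_and_stored_types() is hypothetical.

```python
import sqlite3


def declared_and_stored_types():
    """Create oil_1973 in memory and report declared vs. stored types."""
    con = sqlite3.connect(":memory:")
    # Same column layout as the CREATE TABLE statement in the slides.
    con.execute("""
        CREATE TABLE oil_1973(
            month INT,
            barrels_purchased REAL,
            total_value REAL,
            unit_price REAL,
            cpi REAL
        )
    """)
    # January 1973 from the slides; month is deliberately passed as a float
    # and barrels_purchased as an integer to show the conversion.
    con.execute("INSERT INTO oil_1973 VALUES (?, ?, ?, ?, ?)",
                (1.0, 102750, 282153, 2.75, 42.6))
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    declared = [(name, ctype)
                for _, name, ctype, *_ in con.execute("PRAGMA table_info(oil_1973)")]
    stored = con.execute(
        "SELECT typeof(month), typeof(barrels_purchased), month, barrels_purchased "
        "FROM oil_1973"
    ).fetchone()
    con.close()
    return declared, stored


declared, stored = declared_and_stored_types()
print(declared[:2])  # [('month', 'INT'), ('barrels_purchased', 'REAL')]
print(stored)        # ('integer', 'real', 1, 102750.0)
```

A column declared INT stores 1.0 as the integer 1, and a column declared REAL stores 102750 as 102750.0, which matches how month and barrels_purchased appear when the oil_1973 table is queried in the shell.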

Add data with command INSERT INTO

    sqlite> INSERT INTO oil_1973
       ...> SELECT month, barrels_purchased, total_value, unit_price, cpi
       ...> FROM oil
       ...> WHERE year = 1973;

Verify

    sqlite> .tables
    oil       oil_1973
    sqlite> SELECT * FROM oil_1973
       ...> LIMIT 5;
    month       barrels_purchased  total_value  unit_price  cpi
    ----------  -----------------  -----------  ----------  ----------
    1           102750.0           282153.0     2.75        42.6
    2           95276.0            259675.0     2.73        42.9
    3           112053.0           316283.0     2.82        43.3
    4           98841.0            279958.0     2.83        43.6
    5           123102.0           357042.0     2.9         43.9

Creating new tables from outside data

    nukes <- readr::read_table(
      "http://users.stat.ufl.edu/~winner/data/nuketest.dat",
      col_names = c("year", "month", "month_series", "tests")
    )
    nukes
    #> # A tibble: 576 x 4
    #>     year month month_series tests
    #>    <dbl> <dbl>        <dbl> <dbl>
    #>  1  1945     1            1     0
    #>  2  1945     2            2     0
    #>  3  1945     3            3     0
    #>  4  1945     4            4     0
    #>  5  1945     5            5     0
    #>  6  1945     6            6     0
    #>  7  1945     7            7     1
    #>  8  1945     8            8     2
    #>  9  1945     9            9     0
    #> 10  1945    10           10     0
    #> # … with 566 more rows

    # The file name is passed by position so the call works whether the
    # argument is named path (older readr) or file (newer readr).
    readr::write_csv(nukes, "nukes.csv")

To import a CSV file of data as a table

    sqlite> .mode csv
    sqlite> .import nukes.csv nukes
    sqlite> SELECT * FROM nukes LIMIT 5;
    year,month,month_series,tests
    1945,1,1,0
    1945,2,2,0
    1945,3,3,0
    1945,4,4,0
    1945,5,5,0

Adjust the mode for prettier output with .mode, for example .mode column.
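
The same import can be sketched outside the shell. The example below uses Python's csv and sqlite3 modules, which is an assumption about tooling and not the method in the slides. It loads the first eight rows of nukes.csv as they appear in the tibble output above; the import_csv() helper and the in-memory database are hypothetical. One difference worth knowing: when the target table does not already exist, the shell's .import is documented to create every column as TEXT, whereas this sketch declares the columns as INTEGER.

```python
import csv
import io
import sqlite3

# First eight rows of nukes.csv as shown in the slides; a stand-in for the file.
NUKES_CSV = """year,month,month_series,tests
1945,1,1,0
1945,2,2,0
1945,3,3,0
1945,4,4,0
1945,5,5,0
1945,6,6,0
1945,7,7,1
1945,8,8,2
"""


def import_csv(con, table, handle):
    """Create `table` from the CSV header row and load the remaining rows."""
    reader = csv.reader(handle)
    header = next(reader)
    # Assumption: every column holds whole numbers, so declare them INTEGER.
    # Names are interpolated into the SQL, so only use trusted input here.
    columns = ", ".join(f'"{name}" INTEGER' for name in header)
    con.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    con.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return header


con = sqlite3.connect(":memory:")
import_csv(con, "nukes", io.StringIO(NUKES_CSV))
total_tests = con.execute("SELECT SUM(tests) FROM nukes").fetchone()[0]
print(total_tests)  # 3
```

To read the real file, replace io.StringIO(NUKES_CSV) with open("nukes.csv", newline=""). The csv module yields strings, and the INTEGER column declaration is what converts them to integers on insert.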
