CS317 File and Database Systems

Lecture 1 - Introduction

August 28, 2017, Sam Siewert

Dr. Sam Siewert

1984-85 UC Berkeley – Philosophy/Physics

University of Notre Dame, BS - Aerospace/Mechanical Engineering

1985-89 Johnson Space Center, U. of Houston – UHCL Computer Engineering, Mission Control Center

U. of Colorado, Boulder, MS/PhD – JPL, 1989-92 Colorado Space Grant, Computer Science

CU Boulder Senior Instructor, Adjunct Professor, CTO, Architect, Developer/Engineer in Local Start-ups 1992-2012

U. of Alaska, Anchorage, Assistant Professor, Computer Systems Engineering, Alaska Space Grant 2012-14

Embry Riddle Prescott, Assistant Professor, CESE

Related Industry Background

General Experience (~25 Years in Embedded and Scalable Systems)
– 12 Years NASA JSC, NASA JPL, CU, Ball Aerospace
– 12+ Years Commercial Telecom, Storage/Networks, Embedded, Digital Video

Software Engineering
– NASA Johnson and JPL (Shuttle Ascent/Entry Guidance, Deep Space)
– Intel, Emulex, Start-ups SAS CTO
– RAID, HPC
– iSCSI, Fiber Channel, InfiniBand
– Large Boxes of Disk Drives - JBODs
– Large Boxes of SSDs - JBOF? RAID

Consulting
– Graphics, Storage and Networking
– Advanced RAID and Erasure Codes


Learning Objectives

Introduction - Block Storage, Structured and Unstructured Data
– Block Storage Devices [Disk Drives, SSD, Persistent Memory]
– Split between Files and Databases [Integration of Both]

DML - Data Manipulation using SQL (ISO Standard)

DDL - Data Definition Language, Database Design with SQL
– Logical Design [Schema]
– Physical Design [Hosting DB]

Theory of Databases
– Relational
– OODBMS (SQL Extensions, C++ Alternatives)
– NoSQL

DBA - Database Administration and Security

File and Database Systems
– Physical Implementation and Scaling [Block Partition, File]
– Indexing [ISAM, B Tree, B+ Tree, R-Tree]
– Network Access [Connectors]
– Web Front Ends

File and DB Systems Experience

2002-2006 – Emulex (Intel), Chip-down Fiber Channel

2006-2010 – Atrato Inc. (Start-up), Scalable HDD/SSD Hybrid RAID Systems (10GE, 4/8G Fiber Channel)

2010-12 – Intel Corporation – Beyond Software RAID Research

Low-Level! - Block Layer Below Oracle, MS SQL, MySQL

Self-Taught on MySQL (It’s Like Snowboarding – Easy to Learn, Hard to Master)

Current Research - Join Us!

DHS Arctic Domain Awareness – Smart Cameras & Information Fusion [2013-15]

ERAU ICARUS Drone Net [2016 – present]

5th Big Data Silicon Valley, Drone Net, NASA UTM

Machine Vision

Machine Learning

Big Data Analytics

GP-GPUs, FPGA

Google TPU

NAS Storage

Structured & Unstructured Data

Course Goals and Outline

New Textbook! Simpler, More Hands On!

Database Systems: Introduction to Databases and Data Warehouses, Nenad Jukic, Susan Vrbsky, Svetlozar Nestorov (ISBN 978-1-94-315319-0)

… two primary goals for students; the first goal is to learn the fundamentals of relational, object-relational and object-oriented database systems and the second goal is to have hands-on experience on database design, implementation, management and programming.

Old Textbook http://mercury.pr.erau.edu/~siewerts/cs317/ Good Pro Reference Oracle Systems Current Syllabus

Why SQL and MySQL

Wide Use, Partially Open Source (Alternate – MariaDB)

RDBMS Majority of Market Share – 2014 Survey

People Seem to Like it! (90% Neutral to Better than)

Information Week TechDigest, March 2014 “2014 State of Database Tech”

File System and DBMS Code

Take Poll on Who has had CS125 [C programming], CS225 [C++ programming], CS315 [Data structures]

ALL must Learn SQL and Use the PRClab Web Interface and Command Line Interface

Assignment #5 Options [Teams]
– Dig Deeper into DBA SQL Security [Non C/C++ Programmer Option]
– DBE Design to Deploy a DDL Schema and Populate with DML Test Data
– Build a C/C++ Connector [or Java JDBC]

ALL must Learn OOA/OOD and Basic OODBMS Concepts

PRClab – Linux for C/C++

Option #1 – Use PRClab, prclab.pr.erau.edu via SSH
– Recommend PuTTY connection with SSH and X11 forwarding
– Code Development (GCC/g++, Make, etc.) - http://mercury.pr.erau.edu/~siewerts/cs317/documents/Linux/
– Debugging C/C++ Source Code using DDD (http://www.gnu.org/software/ddd/manual/html_mono/ddd.html)
– Verification and Validation of C/C++ Implementations
– General Linux System (RHEL 6.5)
– http://prclab.pr.erau.edu/adminer.php (PHPAdmin login)
– [email protected] – For Help with PRClab login, etc.

Option #2 – Use Virtual-Box Linux with CentOS 6.5 or Ubuntu 14.04 LTS (Both Supported)
– Must Have Windows, Macintosh or Linux Personal PC
– http://mercury.pr.erau.edu/~siewerts/cs317/documents/Linux/Linux-Development-Getting-Started.pdf

Why Work with Linux and Virtual Box…

From Mobiles to Super-Computing to Datacenters

Embedded/Mobile Google Tianhe – 33+ Pflops

http://www.top500.org/

From Android Mobiles to GIS and Digital Video Services

Huge Value in Open Source Drivers, Tools, and Applications – Speeds Up Time to Market

Oracle Virtual Box – Great Cross OS Test Environment

How We’ll Do It

1/3 Knowledge, Concepts, Theory
– Lectures/Reading (Ongoing)
– Lectures related to Connolly-Begg Textbook and Instructor’s Experience
– Logical and Physical DBMS
– Review of File systems and Key Differences with DBMS
– LAMP – Linux, Apache, MySQL, PHP
– Discussions

1/3 Practice
– PRClab or VB-Linux LAMP – SQL, C code, PHP or Python if you Wish
– Building, Modifying, Querying and Using MySQL RDBMS (Oracle), http://www.mysql.com/
– Logical RDBMS Design, Implementation and Management
– Physical RDBMS Data structures (B-tree), Physical Ext-n I-node structures and debugfs

1/3 Project [DBA, DBE]
– Group Project to Build a Significant Database on PRClab
– Final Assignment
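As a rough illustration of the practice work (defining a schema with DDL, then populating, modifying and querying it with DML), here is a minimal sketch. It makes several assumptions that are not from the course material: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for the MySQL server on PRClab, and the table name, column names and rows are hypothetical test data.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL server used in class.
conn = sqlite3.connect(":memory:")

# DDL: define the logical schema (hypothetical table and columns).
conn.execute(
    """
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        major TEXT NOT NULL
    )
    """
)

# DML: populate with test data, then modify and query it.
conn.executemany(
    "INSERT INTO student (id, name, major) VALUES (?, ?, ?)",
    [(1, "Ada", "CS"), (2, "Grace", "CE"), (3, "Linus", "CS")],
)
conn.execute("UPDATE student SET major = ? WHERE id = ?", ("SE", 3))
conn.commit()

# Query: names of students whose major is still CS after the update.
cs_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE major = ? ORDER BY name", ("CS",)
    )
]
print(cs_names)
```

The same CREATE TABLE, INSERT, UPDATE and SELECT statements should carry over to MySQL with little or no change, though data types and connection setup differ between the two systems.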

Administrivia

Introductions
– Instructor (Office Hours)
– Students (Introductions)
– Please do Collaborate, but cite well!
– Policies - http://mercury.pr.erau.edu/~siewerts/cs317/policies/

Mercury and CANVAS (So Happy Together!)
– CANVAS Assignment Management Tool - https://erau.instructure.com/
– Access via ERNIE - https://ernie.erau.edu
– Backup Mercury Website - http://mercury.pr.erau.edu/~siewerts/cs317/

Course Information
– Attendance & E-mail list (please sign up on sheet being passed around)
– Lecture Notes at http://mercury.pr.erau.edu/~siewerts/cs317/documents/
– Will post Assignments on CANVAS

Must have PRClab account OR VB-Linux - http://prclab.pr.erau.edu/adminer.php

I highly recommend both if possible, but PRClab is sufficient

MySQL on Linux (LAMP) Skills

File system / Block Storage

Web Client

Apache

PHP Introduction Session

MySQL (File System vs. DBMS)

Linux OS

SAN/NAS

4-tier PRClab with Adminer LAMP

First tier – Your client Web browser

Second tier – Apache Web server

Third tier – Adminer PHP interface between Web server and MySQL

Fourth tier – MySQL / Linux Storage

File System Concepts – Review?

Name Space

Meta-Data

File-Data

I-nodes

Blocks with Byte Offsets

Files Belong to Name Space
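To make the concepts listed above concrete, the following minimal sketch creates a file (a name-space entry), reads its metadata from the i-node, and then reads file data at a byte offset. The helper names describe_file and read_at and the file name notes.txt are hypothetical, chosen only for illustration; the sketch assumes a POSIX-style file system such as ext3.

```python
import os
import tempfile


def describe_file(path):
    """Return a few metadata fields that the file system keeps for the file."""
    st = os.stat(path)
    return {"inode": st.st_ino, "size_bytes": st.st_size, "links": st.st_nlink}


def read_at(path, offset, length):
    """Read `length` bytes of file data starting at byte `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


with tempfile.TemporaryDirectory() as tmp:
    # The path is the name-space entry; the bytes written are the file data.
    path = os.path.join(tmp, "notes.txt")
    with open(path, "wb") as f:
        f.write(b"0123456789abcdef")

    meta = describe_file(path)    # metadata: i-node number, size, link count
    chunk = read_at(path, 10, 4)  # file data addressed by byte offset

print(meta["size_bytes"], chunk)
```

On Linux, `ls -i` and `stat` show the same i-node number and size from the shell.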

Volume Use by File and Folder

Make a File System

mkfs -t ext3 /dev/sdb1

Install Name Space and Metadata on Volume

Use a File System

Block Access via dd in Linux – e.g. Clone Volume

Name Space Access via Shell or Browser
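Block access bypasses the name space and copies raw fixed-size blocks. The sketch below imitates what a dd-style clone does, with assumptions worth stating: ordinary temporary files stand in for block devices (reading a real device such as /dev/sdb1 normally requires root), and the function name clone_blocks, the file names and the 4096-byte block size are hypothetical choices for illustration.

```python
import os
import tempfile


def clone_blocks(src_path, dst_path, block_size=4096):
    """Copy src to dst one fixed-size block at a time; return blocks copied."""
    blocks = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of the source volume image
                break
            dst.write(block)
            blocks += 1
    return blocks


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "volume.img")  # stand-in for a source volume
    dst = os.path.join(tmp, "clone.img")   # stand-in for the clone target
    with open(src, "wb") as f:
        f.write(os.urandom(10_000))

    copied = clone_blocks(src, dst, block_size=4096)

    # Verify the clone is byte-for-byte identical to the source.
    with open(src, "rb") as a, open(dst, "rb") as b:
        identical = a.read() == b.read()

print(copied, identical)
```

The last block is partial (10,000 bytes is not a multiple of 4096), which is why the loop counts three blocks. Nothing in the loop interprets directories or files: the copy works the same whatever file system, if any, is on the volume.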

MySQL Command Line and PHPAdmin

Why and How is an RDBMS Different?

Login to MySQL via Web
– http://prclab.pr.erau.edu/adminer.php
– Sign up for account today!!!
– Explore
– Read Tutorial Docs - http://dev.mysql.com/doc/index-topic.html

Login to MySQL via SSH Shell (PuTTY)

In Case You Want More (NOT REQUIRED)
– Install Oracle Virtual Box
– Install CentOS 6.5 or Ubuntu 14.04 LTS
– Install MySQL
– Install Apache and PHP
– Setup PHPAdmin on localhost


Next Time …

DBMS vs. File System (read DBMS Chapter #1) – Come to Class Prepared to Discuss
– Why is a DBMS Better than a File System?
– Why do we Still Use File Systems?
– What is NoSQL? (Not in Book, http://en.wikipedia.org/wiki/Nosql)
– Structured vs. Unstructured Data? – For Discussion

Assignment #1 Discussion
– I will Post Every Other Wednesday, We’ll Discuss, Due Following Week on Friday
– Late Assignments – 10% Penalty for Monday Turn-in; After Monday, only with Instructor Permission
