NoSQL Primer: What, Where, When, Why & How
Like Agile, It’s Just a Free-for-All for the Lazy

Introductions
❖ William Klos
❖ Centric Consulting
    ❖ Senior Architect
    ❖ National Lead for Cloud Computing
❖ Job: Design/Build Cloud & Distributed Architectures, Strategy
❖ Hobbies: Programming Languages, Distributed Algorithms, Text Editors, Command Lines & Talking About the Old Days
❖ e: [email protected]
❖ t: @williamklos
❖ cis: 73077,1601

Agenda
❖ A Little History
❖ Define NoSQL
❖ Types of NoSQL Databases
❖ Some Theory
❖ Design Considerations
❖ When to Use
❖ Q&A

Personal “NoSQL” History

Relational DBMS
❖ Proprietary Data Access Language
    ❖ STORE “Bill” TO FirstName
    ❖ @5,10 SAY Salary PICTURE “$99,999.99”
    ❖ DO WHILE…ENDDO
    ❖ IF…ELSE…ENDIF
❖ see also: Clipper, FoxBASE+

Relational DBMS
❖ AS/400 DB2
❖ Query/400 was default choice and was NOT SQL.
❖ Separate License!
❖ Fine for Simple Reports and work tables.
❖ Complex reports usually got an RPG program.
❖ 3.5 NF FTW!

Post-Relational DBMS
❖ PICK
❖ Technically, pre-dates the relational model.
❖ Row & Column Schemas
❖ do while !eol & !eof
❖ Multi-Value
❖ Field Separators & Value Separators
❖ Accessed via PickBASIC applications.

    001|klos|william|100}200}300

Technically, all of these databases were “NoSQL”, but none of them really seem to fit today's definition.

Definition

“NoSQL” Definitions
1. NO SQL WAS USED TO ACCESS THIS DATA!
2. Any non-Relational Database
3. Not *only* SQL is used to access/store data.

How about “Schemaless”?

“Schemaless” Definitions
• Free-For-All Structure or No Structure at All
  “Dogs and cats living together. Mass hysteria!”
• Isn’t That What Lotus Notes Used?

Schemas? I Have One!

    type ScubaUser struct {
        ID             string    `json:"id"` // User ID.
        UserName       string    `json:"username" binding:"required"`
        Password       string    `json:"password" binding:"required"`
        LastName       string    `json:"last_name" binding:"required"`
        FirstName      string    `json:"first_name" binding:"required"`
        PicURL         string    `json:"pic_url"`
        Email          string    `json:"email" binding:"required"`
        Phone          string    `json:"phone" binding:"required"`
        TransmitterID  string    `json:"transmitter_id"`
        CompanyID      string    `json:"company_id" binding:"required"`
        FacilityID     string    `json:"facility_id" binding:"required"`
        ReferenceID    string    `json:"reference_id"` // Reference # of an external system.
        IsConfirmed    bool      `json:"is_confirmed"`
        IsOnDuty       bool      `json:"is_onduty"`
        LastClockIn    time.Time `json:"last_clockin"`
        FacilityRoomID string    `json:"facilityroom_id"` // Current Room in which the User is located.
        LastRoomEnter  time.Time `json:"last_roomenter"`  // Timestamp of the last time the person entered a room.
        LastHandWash   time.Time `json:"last_handwash"`   // Last time the User was seen washing their hands.
        LoginCount     int       `json:"login_count"`
        LastLogin      time.Time `json:"last_login"`
        AuditFields    Audit     `json:"audit"`
    }

A Better Definition?
Where the data access language is not relevant.
Where the structure, or lack thereof, of the data is not relevant.

A Better Definition
Where a SQL schema is enforced on a DB write by the DB itself, a NoSQL schema is enforced on the DB read by the application (with a little on the DB write by the DB).

NoSQL Hallmarks
❖ “Shared Nothing” Architectures can Scale Linearly
    ❖ No tables to tie together, no DB enforced linkages between records.
❖ Support of Structured, Semi-Structured, or Unstructured Data
❖ Read/Write Efficiency
    ❖ Not a lot of JOINs and INDEXes to keep up-to-date.
❖ Designed to Be Clustered and Decentralized with Built-In Failover
❖ Easier to Include in Automation Schemes
    ❖ Standing up a complete cluster and siphoning data into it can be done with a few lines of script.
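
The “A Better Definition” slide above says a NoSQL schema is enforced on read by the application rather than on write by the database. A minimal Go sketch of that idea follows. It is not code from the talk: the `User` struct is a hypothetical, trimmed-down cousin of `ScubaUser`, the `store` map is only a stand-in for a real document database, and the keys and sample values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// User is a hypothetical, trimmed-down version of the ScubaUser struct.
type User struct {
	ID       string `json:"id"`
	UserName string `json:"username"`
	Email    string `json:"email"`
}

// store stands in for a document database: it accepts any bytes under
// any key and enforces nothing on write.
type store map[string][]byte

// readUser is where the schema is enforced. The application decodes the
// raw document and rejects anything that does not meet its expectations.
func readUser(s store, key string) (User, error) {
	raw, ok := s[key]
	if !ok {
		return User{}, errors.New("not found: " + key)
	}
	var u User
	if err := json.Unmarshal(raw, &u); err != nil {
		return User{}, fmt.Errorf("decode %s: %w", key, err)
	}
	if u.ID == "" || u.UserName == "" || u.Email == "" {
		return User{}, fmt.Errorf("document %s is missing required fields", key)
	}
	return u, nil
}

func main() {
	s := store{}
	// Both writes succeed; the store does not care what the bytes contain.
	s["u::1"] = []byte(`{"id":"u::1","username":"wklos","email":"w@example.com"}`)
	s["u::2"] = []byte(`{"id":"u::2","nickname":"bill"}`)

	for _, key := range []string{"u::1", "u::2"} {
		u, err := readUser(s, key)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", u.UserName)
	}
}
```

Both documents are written without complaint; only the read path notices that `u::2` lacks a username and an email. Putting the check in one read function is just one possible design; a real application might validate in a shared decoding layer instead.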
NoSQL Hallmarks
❖ Introduces “Right Tool/Right Job” Thinking
❖ Easier to Build & Destroy in Disposable Environments
    ❖ Treat your servers like cattle, not pets.
❖ Very Easy to Develop Against
    ❖ Read, Write, Delete (usually no Update)
    ❖ I can “curl” my data.
❖ Relatively Easy to Maintain
    ❖ Which is good because you may have hundreds to thousands of them
❖ Awesome for Transient Data

NoSQL Hallmarks
But you do have to think differently.

Types of NoSQL Databases
❖ Document Database
❖ Key/Value Store
❖ Time Series Database
❖ Graph Database
❖ Correlation Database
❖ Data Structure Server
❖ Object Database
❖ Triple/Quad (RDF)
❖ Other

Document Database
❖ One of the more popular flavors of NoSQL.
❖ Typically stores data as JSON structures.
    ❖ Also YAML, BSON, or XML
❖ Easy to Read/Write Object Structures
❖ Append Writes, No Updates
❖ Versioned Documents
❖ Examples
    ❖ Couchbase & CouchDB
    ❖ MongoDB
    ❖ DocumentDB
    ❖ Lotus Notes
    ❖ RethinkDB

Document Database

    {
      "id": "c::6b7a2336-4b04-4644-ab6e-5a0ec822e07f",
      "name": "CENTRIC CONSULTING",
      "relays": [
        "r::cdc649d6-2e33-4ad9-900e-43d84e27f642"
      ],
      "transmitters": [
        "x::7d4d8493-341b-4466-8ba9-dced3752e905"
      ],
      "audit": {
        "is_active": true,
        "is_ready_to_delete": false,
        "is_system": false,
        "objtype": "ScubaCompany",
        "chghash": "64ccd00535e8020f0d3b5d3e849fe6e3",
        "adduser": "SCUBA System",
        "chguser": "SCUBA System",
        "addstamp": "2016-09-08T14:57:03.369981495Z",
        "chgstamp": "2016-09-08T20:38:53.304628131Z",
        "addfunc": "main.(*ScubaCompany).makedefault",
        "chgfunc": "main.(*ScubaCompany).save"
      }
    }

    {
      "id": "x::7d4d8493-341b-4466-8ba9-dced3752e905",
      "vendor_id": "d861357d974461dd",
      "serial_nbr": "SCU-UAB-001",
      "battery": 1700,
      "lumens": 450000,
      "temp": 0,
      "type": "CARD",
      "is_registered": true,
      "user_id": "u::5c77ccb1-0dc8-4b1c-ad8d-b72b00fcfcf2",
      "regstamp": "2016-09-08T20:38:53.348147409Z",
      "last_seen": "2016-09-15T10:55:53.348147409Z",
      "audit": {
        "is_active": true,
        "is_ready_to_delete": false,
        "is_system": false,
        "objtype": "ScubaTransmitter",
        "chghash": "2b35b2ae7edbc5ebb1f17823127b024b",
        "adduser": "SCUBA System",
        "chguser": "SCUBA System",
        "addstamp": "2016-09-08T20:38:53.264638132Z",
        "chgstamp": "2016-09-08T20:38:53.368619741Z",

Couchbase N1QL

    select count(*) as count
    from scuba
    where audit.objtype = "ScubaTransmitter"

    type ScubaTransmitter struct {
        ID           string    `json:"id"`                      // Transmitter ID
        VendorID     string    `json:"vendor_id"`
        SerialNumber string    `json:"serial_nbr"`              // Easily identifiable # or name
        Battery      int       `json:"battery"`                 // in mV
        Lumens       int       `json:"lumens"`                  // in Lumens
        Temperature  float32   `json:"temp"`                    // in x.xx C
        Rssi         int       `json:"rssi"`                    // beacon signal level at BLE device
        Type         string    `json:"type" binding:"required"` // Hardware form: card, puck, etc.
        IsRegistered bool      `json:"is_registered"`           // Is the card registered to a user?
        UserID       string    `json:"user_id"`                 // User to which the card is registered.
        RegStamp     time.Time `json:"regstamp"`                // Timestamp the card was registered.
        FirstSeen    time.Time `json:"first_seen"`              // When was the card first seen?
        LastSeen     time.Time `json:"last_seen"`               // When was the card last seen?
        LastSeenBy   string    `json:"last_seen_by"`            // Which Relay saw this card last.
        AuditFields  Audit     `json:"audit"`                   // Standard Audit Fields
    } // ScubaTransmitter

    type Audit struct {
        IsActive        bool      `json:"is_active" xml:"is_active"`
        IsReadyToDelete bool      `json:"is_ready_to_delete" xml:"is_ready_to_delete"`
        IsSystem        bool      `json:"is_system" xml:"is_system"`
        ObjType         string    `json:"objtype" xml:"objtype"`
        ChgHash         string    `json:"chghash,omitempty" xml:"chghash"`
        AddUser         string    `json:"adduser,omitempty" xml:"adduser"`
        ChgUser         string    `json:"chguser,omitempty" xml:"chguser"`
        AddStamp        time.Time `json:"addstamp,omitempty" xml:"addstamp"`
        ChgStamp        time.Time `json:"chgstamp,omitempty" xml:"chgstamp"`
        AddFunc         string    `json:"addfunc,omitempty" xml:"addfunc"`
        ChgFunc         string    `json:"chgfunc,omitempty" xml:"chgfunc"`
    }

Couchbase N1QL

    select id
    from scuba
    where array_length(facilities) > 0
    and audit.objtype = "ScubaCompany"

    [
      { "id": "c::6b7a2336-4b04-4644-ab6e-5a0ec822e07f" },
      { "id": "c::d641db14-0158-4924-872e-e547bb7f342d" }
    ]

Couchbase N1QL

    select facilities, audit.*
    from scuba
    where facilities[0] = 'f::fb9caf8e-1f30-4e27-a735-081c10655c44'
    and audit.objtype = "ScubaCompany"

    [
      {
        "addfunc": "main.(*ScubaCompany).makedefault",
        "addstamp": "2016-09-08T14:57:03.234699776Z",
        "adduser": "SCUBA System",
        "chgfunc": "main.(*ScubaCompany).save",
        "chghash": "523f5473904b486c803e95a11d3060a6",
        "chgstamp": "2016-09-08T14:57:09.825765729Z",
        "chguser": "SCUBA System",
        "facilities": [
          "f::fb9caf8e-1f30-4e27-a735-081c10655c44"
        ],
        "is_active": true,
        "is_ready_to_delete": false,
        "is_system": false,
        "objtype": "ScubaCompany"
      }
    ]

Key/Value Store
❖ Giant associative array or map.
❖ Consistency models can be RAM-only, eventual consistency, append-only, even serializable.
❖ The “value” side can be any data, any structure, any format.
❖ Examples
    ❖ Oracle NoSQL
    ❖ Riak
    ❖ DynamoDB
    ❖ Voldemort
    ❖ Cassandra
    ❖ Redis

Time Series Database
❖ Series of data points in time order.
❖ Two Major Types of Data:
    ❖ Regular Interval
    ❖ Irregular (Event)
❖ Handles time ranges, date rollup, time zones and other notoriously difficult date/
❖ Examples
    ❖ Riak-TS
    ❖ InfluxDB
    ❖ Informed Time Series
    ❖ Graphite
    ❖ RRDtool

    %    1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
    y = [112  115  145  171  196  204  242  284  315  340  360  417   % Jan
         118  126  150  180  196  188  233  277  301  318  342  391   % Feb
         132  141  178  193  236  235  267  317  356  362  406  419   % Mar
         129  135  163  181  235  227  269  313  348  348  396  461   % Apr
         121  125  172  183  229  234  270  318  355  363  420  472   % May
         135  149  178  218  243  264  315  374  422  435  472  535   % Jun
         148  170  199  230  264  302  364  413  465  491  548  622   % Jul
         148  170  199  242  272  293  347  405  467  505  559  606   % Aug
         136  158  184  209  237  259  312  355  404  404  463  508   % Sep
         119  133  162  191  211  229  274  306  347  359  407  461   % Oct