Apache HBase, the Scaling Machine
Jean-Daniel Cryans, Software Engineer at Cloudera, @jdcryans

Agenda
• Introduction to Apache HBase
• HBase at StumbleUpon
• Overview of other use cases

About le Moi
• At Cloudera since October 2012.
• At StumbleUpon for 3 years before that.
• Committer and PMC member for Apache HBase since 2008.
• Living in San Francisco.
• From Québec, Canada.

What is Apache HBase?
Apache HBase is an open source, distributed, scalable, consistent, low-latency, random-access, non-relational database built on Apache Hadoop.

Inspiration: Google BigTable (2006)
• Goal: low-latency, consistent, random read/write access to massive amounts of structured data.
• It was the data store for Google's crawler web table, Gmail, Analytics, Earth, Blogger, ...

HBase is in Production
• Inbox
• Storage
• Web
• Search
• Analytics
• Monitoring

HBase is Open Source
• Apache 2.0 License.
• A community project with committers and contributors from diverse organizations:
  • Facebook, Cloudera, Salesforce.com, Huawei, eBay, Hortonworks, Intel, Twitter, ...
• The code license means anyone can modify and use the code.

So why use HBase?

Old School Scaling
• Find a scaling problem.
• Beef up the machine.
• Repeat until you cannot find a big enough machine or run out of funding.

"Get Rid of Everything" Scaling
• Remove text search queries (LIKE).
• Remove joins.
  • Joins due to normalization require expensive seeks.
• Remove foreign keys and encode your relations.
• Avoid constraint checks.
• Put all parts of a query in a single table.
• Use read slaves to scale reads.
• Shard to scale writes.

We "optimized the DB" by discarding some fundamental SQL/relational database features.

HBase is Horizontally Scalable
• Adding more servers linearly increases performance and capacity:
  • Storage capacity
  • Input/output operations
• Store and access data on 1 to 1000s of commodity servers.
• Largest cluster: >1000 nodes, >1PB.
• Most clusters: 10-40 nodes, 100GB-4TB.

HBase is Consistent
• Brewer's CAP theorem:
  • Consistency: DB-style ACID guarantees on rows.
  • Availability: favor recovering from faults over returning stale data.
  • Partition tolerance: if a node goes down, the system continues.

HBase Dependencies
• Apache Hadoop HDFS for data durability and reliability (Write-Ahead Log).
• Apache ZooKeeper for distributed coordination.
• Built-in support for running Apache Hadoop MapReduce jobs.
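Because clients bootstrap through ZooKeeper rather than the master, pointing an application at a cluster is mostly a matter of configuration. The sketch below illustrates this with the same 0.94-era client API used in the Java API slides later in this deck; the ZooKeeper hostnames are illustrative assumptions, not from the slides.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class ConnectSketch {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml from the classpath, if present.
            Configuration config = HBaseConfiguration.create();
            // Illustrative quorum; region locations are then discovered
            // through ZooKeeper, with no call to the master.
            config.set("hbase.zookeeper.quorum",
                       "zk1.example.com,zk2.example.com,zk3.example.com");

            HTable table = new HTable(config, "employees");
            // ... reads and writes go here ...
            table.close();
        }
    }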
HBase on a Cluster
[Diagram: a full HBase cluster.]

Tables and Regions
[Diagram: a table split into regions.]

Load Distribution
[Diagram: regions spread across two RegionServers.]
• One region is getting too big and affects the balancing (more about writing in a moment).
• Let's split that region in order to split the load.
• Now that we have smaller pieces, Region A and Region B, it's easier to move the load around.
• No data was actually moved during this process, only the responsibility for it!

The region is the unit of load distribution in HBase.

So HBase can scale, but what about Hadoop?

HDFS Data Allocation
[Diagram: a client, a pipeline of three DataNodes, and the NameNode.]
• The client asks the NameNode for locations ("Locations?") and receives a pipeline of DataNodes ("Here you go").
• Data is sent along the pipeline.
• ACKs are sent back as soon as the data is in memory in the last node.

Putting it Together
[Diagram: each machine runs a RegionServer next to a DataNode; the NameNode coordinates.]

Data locality is extremely important for Hadoop and HBase.

Scaling is just a matter of adding new nodes to the cluster.

Sorted Map Datastore
• Implicit PRIMARY KEY (the row key).
• Column names have the format family:qualifier.
• Data is all byte[] in HBase.
• A single cell might have different values at different timestamps.
• Different rows may have different sets of columns (the table is sparse).

    Row key  | info:height | info:state | roles:hadoop | roles:hbase
    cutting  | '9ft'       | 'CA'       | 'Founder'    | 'PMC' @ts=2011
    tlipcon  | '5ft7'      | 'CA'       | 'Committer'  | 'Committer' @ts=2010

Anatomy of a Row
• Each row has a primary key:
  • a lexicographically sorted byte[].
• A timestamp is associated with each value, keeping multiple versions of the data (MVCC for consistency).
• A row is made up of columns.
• Each (row, column) pair is referred to as a cell.
• The contents of a cell are all byte[]s.
  • Apps must "know" the types and handle them.
• Rows are strongly consistent.
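To make the versioned-cell model concrete, here is a small sketch that reads back every stored version of the roles:hbase cell for the cutting row from the example table above. It is written against the same 0.94-era client API as the Java API slides that follow; treat it as an illustration, not part of the original deck.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class VersionsSketch {
        public static void main(String[] args) throws Exception {
            Configuration config = HBaseConfiguration.create();
            HTable table = new HTable(config, "employees");

            // Ask for every stored version of roles:hbase, not just the latest.
            Get g = new Get(Bytes.toBytes("cutting"));
            g.addColumn(Bytes.toBytes("roles"), Bytes.toBytes("hbase"));
            g.setMaxVersions();

            Result r = table.get(g);
            // Each KeyValue carries its own timestamp, so one cell can
            // hold different values at different points in time.
            for (KeyValue kv : r.raw()) {
                System.out.println(kv.getTimestamp() + " -> "
                        + Bytes.toString(kv.getValue()));
            }
            table.close();
        }
    }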
Access HBase Data via an API
• Data operations:
  • Get
  • Put
  • Delete
  • Scan
  • Compare-and-swap
• DDL operations:
  • Create
  • Alter
  • Enable/Disable
• Access via the HBase shell, the Java API, or the REST proxy.

Java API
HBase provides utilities for easy conversions:

    byte[] row = Bytes.toBytes("jdcryans");
    byte[] fam = Bytes.toBytes("roles");
    byte[] qual = Bytes.toBytes("hbase");
    byte[] putVal = Bytes.toBytes("PMC");

    // This reads the configuration files.
    Configuration config = HBaseConfiguration.create();
    // This creates a connection to the cluster, no master needed.
    HTable table = new HTable(config, "employees");

    Put p = new Put(row);
    p.add(fam, qual, putVal);
    // By default all operations are persisted.
    table.put(p);

    Get g = new Get(row);
    Result r = table.get(g);
    byte[] jd = r.getValue(fam, qual);
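Scan appears in the list of data operations above but is never demonstrated in the code slides. Below is a minimal sketch under the same 0.94-era API; the start and stop keys and the choice of the roles family are illustrative assumptions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanSketch {
        public static void main(String[] args) throws Exception {
            Configuration config = HBaseConfiguration.create();
            HTable table = new HTable(config, "employees");

            // Rows are stored sorted by row key, so a scan from "a"
            // (inclusive) to "n" (exclusive) streams them back in
            // lexicographic order, possibly from several regions.
            Scan scan = new Scan(Bytes.toBytes("a"), Bytes.toBytes("n"));
            scan.addFamily(Bytes.toBytes("roles"));

            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result result : scanner) {
                    System.out.println(Bytes.toString(result.getRow()));
                }
            } finally {
                scanner.close();
            }
            table.close();
        }
    }

Because rows come back in key order, row key design decides which scans are cheap, which is why the sorted-map model earlier in the deck matters so much in practice.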