Google's Mission
& Big Data & Rocket Fuel
Dr Raj Subramani, HSBC
Reza Rokni, Google Cloud, Solutions Architect
Adrian Poole, Google Cloud

Google’s Mission
Organize the world’s information and make it universally accessible and useful

Eight cloud products with ONE BILLION Users

Increasing Marginal Cost of Change
[Chart: marginal cost of change ($) against increasing complexity of systems and processes. Labels: Traditional Architectures (Prohibitively Expensive); Google Cloud Native Architectures (GCP), 18 years of Google R&D / Investment.]

Containers at Google
Enabled Google to grow our fleet over 10x faster than we grew our ops team
[Chart: Number of running jobs and Core Ops Team, 2004 to 2016.]

Google’s innovation in data
[Timeline, 2002 to 2016: Millwheel, F1, Spanner, TensorFlow, MapReduce, Dremel, Flume, GFS, Bigtable, Colossus, Megastore, Pub/Sub, Dataflow]
Proprietary + Confidential

Google’s innovation in data
[Timeline, 2002 to 2016: Dataflow, Spanner, NoSQL Spanner, Cloud ML, Dataproc, BigQuery, Dataflow, GCS, Bigtable, GCS, Datastore, Pub/Sub, Dataflow]

Now available on Google Cloud Platform
● Compute: App Engine, Container Engine, Compute Engine
● Storage & Databases: Storage, Bigtable, Spanner, Cloud SQL, Datastore
● Big Data: BigQuery, Pub/Sub, Dataflow, Dataproc, Datalab
● Machine Learning: Vision API, Machine Learning, Speech API, Translate API

Lesson of the last 10 years...
● Democratise ML
● Big datasets beat fancy algorithms
● Good Models
● Lots of compute

Google BigQuery
BigQuery is Google's fully managed, petabyte scale, low cost enterprise data warehouse for analytics. BigQuery is serverless. There is no infrastructure to manage and you don't need a database administrator, so you can focus on analyzing data to find meaningful insights using familiar SQL. BigQuery is a powerful Big Data analytics platform used by all types of organizations, from startups to Fortune 500 companies.
● Convenient: MB to PB scale and fast; convenience of SQL
● Secure: Encrypted, Durable and Highly Available
● Simple: Fully Managed and Serverless

What is Cloud Dataflow?
● Unified batch and streaming processing
● Fully managed, no-ops data processing
● Open source programming model
● Intelligently scales to millions of QPS

Big Data at HSBC Scale
Dr Raj Subramani, HSBC

Fundamental Review of the Trading Book (FRTB)
● The Basel Committee on Banking Supervision (BCBS) conducted two assessments (the Regulatory Consistency Assessment Programme, February and December 2013) of capital charges for market risk in trading books at institutions with approved internal models
● The significant differences in capital charges confirmed that the market risk framework was in need of reform
● The regulations, in their final form, were published in January 2016
● National supervisors are expected to finalize implementation by January 2019
● Banks are expected to report under the new standards by end of 2019

Fundamental Review of the Trading Book
[Diagram: FRTB at the centre, surrounded by the following topics]
● Trading Book and Banking Book Boundary
● Relationship between Internal Model (IM) and Standardized Approach (SA)
● Treatment of Credit (securitised v/s non-securitised)
● Approach to Risk Management (VaR to Expected Shortfall)
● Treatment of Hedging and Diversification
● Incorporation of liquidity horizons

Working in the Cloud: the tradeoffs
[Diagram: Public Cloud outcomes and risks, with Technology, Cost and Governance labels]
Outcomes
● Business focused IT solution
● Access to latest technology
● Rapid prototyping
● Quicker time to market
● Reduced capacity lag
● Scalability and performance
● Reduced total cost of ownership
Risks
● Data security risks
● Lock-in risks
● Third party dependency risks
● Internal Security clearance
● Regulatory approval
● Data sharing across borders
● Geo-political issues

Cloud Dataflow
[Diagram: a SOURCE (Unbounded or Bounded) feeds the Dataflow service, which writes to a SINK. Service features listed:]
● Resource management
● Graph optimization
● Compute and storage
● Intelligent watermarking
● Work scheduler
● Auto-healing
● Resource auto-scaler
● Monitoring
● Dynamic work rebalancer
● Log collection

The Anatomy of a Risk Engine
[Diagram stages:]
● Trade & Market Data transferred to the Cloud (batch or stream)
● Data distribution and workflow across the analytics
● Store results
● Post-process
[Diagram components: Market Data, Trade Data, Storage (Bounded), Pub/Sub (Unbounded), Dataflow, Analytics, BigQuery, Post Processing]

Dataflow as Risk Engine - Scale and Performance
● 2 million (dummy) plain vanilla mono-currency interest rate swaps in 12 currencies
● Dummy interest rate market data built from Bonds, Futures and Swaps
● Analytics was open source QuantLib (C++ compiled on Linux)

Dataflow as Risk Engine - Stateful Analytics
[Diagram: JVM running C++]
● Performance gains are not always obtained straight out of the box
● Applying domain knowledge and expertise will always help tease out the best performance

The Cloud Journey
• Bring the business problem, not a technical solution
• Beware the frog in the well
• Big Data in Google is just data; the separation of the data from the processing, in Google, allows for clever combinations to address both scenarios

What next?
• Sign up for a Google Cloud account - first $300 free!
• Google Cloud courses @ https://www.coursera.org/ including Qwiklabs
• Contact Ian O’Shea ( [email protected] ) for further info.

Thank you
Dr Raj Subramani, HSBC
Reza Rokni, Google Cloud, Solutions Architect
Adrian Poole, Google Cloud, Financial Services
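
Note on "VaR to Expected Shortfall": the FRTB slides mention the move from Value at Risk to Expected Shortfall without showing the difference. The sketch below is a minimal illustration, not the engine described in the talk. It assumes a simple historical-simulation convention over a list of P&L values; the function names, the sample P&L figures and the quantile convention are all hypothetical, and FRTB specifics such as liquidity horizons and stressed calibration are not modelled.

```python
import math


def value_at_risk(pnl, confidence=0.975):
    """Empirical VaR: the loss at the given confidence level (positive = loss)."""
    losses = sorted(-x for x in pnl)  # convert P&L to losses, ascending
    n = len(losses)
    # Ceil-based empirical quantile; round() guards against float noise
    # such as 0.9 * 10 evaluating slightly above 9.
    k = math.ceil(round(confidence * n, 9)) - 1
    k = min(max(k, 0), n - 1)
    return losses[k]


def expected_shortfall(pnl, confidence=0.975):
    """Empirical ES: the average of all losses at or beyond the VaR level."""
    var = value_at_risk(pnl, confidence)
    tail = [-x for x in pnl if -x >= var]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # Hypothetical P&L sample (illustrative numbers only).
    sample_pnl = [-10, -5, -1, 0, 2, 3, 4, 5, 6, 7]
    print("VaR(90%):", value_at_risk(sample_pnl, 0.9))
    print("ES(90%): ", expected_shortfall(sample_pnl, 0.9))
```

VaR reports a single quantile of the loss distribution, while Expected Shortfall averages everything in the tail beyond it, so ES is never smaller than VaR at the same confidence level and reacts to how severe the extreme losses are.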