Optimizing Application and Microservice Performance


OPTIMIZING APPLICATION AND MICROSERVICE PERFORMANCE: Enhancing Oracle, MySQL, PostgreSQL, and Other Relational Databases with Replication and a Smart Cache

By Irina Rimecode

WHERE IS YOUR DATA GROWTH?

TABLE OF CONTENTS
IN MICROSERVICES WE TRUST
...OR DON'T WE?
"PICK A PILL NEO"
SPLENDORS AND MISERIES OF MODERN DBMSS
SOLVING PITFALLS WITH TARANTOOL
LET'S CHECK THE NUMBERS
TESTING MYSQL, POSTGRESQL AND TARANTOOL
SO WHAT?

Database management systems (DBMSes) like Oracle, PostgreSQL, MySQL, DB2 and Microsoft SQL Server are often responsible for key application and microservice performance pitfalls. Smart caches can solve some of these issues. Tarantool, which combines a smart cache, an application server, and a full disk and in-memory data grid, can effectively optimize existing relational architectures.

IN MICROSERVICES WE TRUST
Let's recall some of the reasons why we love microservices:

The microservice architecture rejects the monolithic application concept. Instead of executing all bounded contexts on a single server using interprocess communication, several small applications work in tandem, each corresponding to one bounded context. As a rule, microservices run on different servers and interact over the network, so a microservice application can be viewed as a kind of distributed system.

Microservices provide the comfort of not putting all of your “eggs of code” into one basket: services can be created and managed by various teams in many programming languages, and these teams have the flexibility to choose a variety of storage technologies. This simplifies both development and maintenance compared to a monolithic app, because even small changes in the latter require rebuilding and redeploying the whole application.

Microservices also improve scalability compared to monolithic apps, because the application doesn't have to be scaled as a whole each time a new module is needed.

...OR DON'T WE?
With all of these unquestionable benefits, though, we should consider the ways monolithic and microservice apps treat their databases. In a monolithic app, the application deals with a single database, which the functional components access directly, and each component can read data belonging to the others. All components possess the same degree of data retention and integrity.

In a microservice system, on the other hand, components work with their own databases, which are not directly accessible to other microservices. So the "my database, not your database" problem arises: microservice data is available for reading and writing only through interfaces. This can cause the degree of data integrity to vary.

Often while the system is running, the data of one service (a “supplier”) is copied in full or in part by another microservice (a “client”). If the data changes, the supplier emits an event to update the copy held by the client. The event drops into the message queue and waits for the client to receive and process it. Until that happens, the supplier's data and the client's copy are inconsistent, although eventually the changes will be applied to all copies. This is known as “eventual consistency”, and it must be taken into account when microservice systems are developed. Decentralization in microservice systems dictates the order.

"PICK A PILL NEO"
The common app pill. A conventional transaction-based approach guarantees consistency but leads to significant temporal coupling and creates problems when dealing with multiple services.

The microservice pill. Distributed transactions are very difficult to implement. The microservice architecture attempts to coordinate between services without transactions by explicitly assuming that consistency can only be eventual, and that any problems that arise will be solved by compensating operations.

Another point to keep in mind is that synchronous calls between services increase downtime.
In microservice systems, downtime compounds: a chain of synchronous calls works only while every component in it works, so the availability of the whole is the product of the availabilities of the individual components interacting with each other. There are two options: either make calls asynchronous, or boost your zen and let the downtime be. But the main issue is yet to come.

SPLENDORS AND MISERIES OF MODERN DBMSS
Let's consider some known headaches related to DBMSes before we switch to testing.

Relational DBMSes lack the sweet properties of cache databases: namely high speed, low latency, and horizontal scaling. It is no accident that they are called “disk” databases: they must work with data stored on disk, and this significantly affects their access speed. In addition, the secondary keys, complex queries, and stored procedures of relational DBMSes scale poorly on large clusters of computers. You can't simply add cluster nodes as needed. Horizontally scalable relational DBMSes do exist, for example NonStop SQL or Teradata, but they are expensive, require advanced hardware, and are challenging to integrate into a diverse environment of data systems.

Relational DBMSes are designed for a predefined data schema and ad hoc queries. You can write arbitrary SELECT statements, arbitrary JOIN statements, etc., but, oops, in microservice systems it is usually the other way around. Their data schema transforms dynamically while the queries are more or less fixed: they change only along with the data schema.

Open, non-relational (NoSQL) databases can be a more cost-effective solution than RDBMSes. But they also have their drawbacks, specifically the lack of transactions.

If a cache is used in combination with a traditional DBMS, for example MySQL plus Memcached or PostgreSQL plus Redis, then you can say goodbye not only to transactions, but also to stored procedures and secondary indexes. Some cache properties are also lost (for example, write capacity is reduced), and new problems arise, including data inconsistency and cold starts.
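The inconsistency and cold-start problems of a cache placed in front of a disk DBMS can be illustrated with a minimal Python sketch. Everything here is a stand-in and an assumption for illustration: two dictionaries play the roles of the DBMS and the cache, and the class and method names (`CacheAside`, `write_db_only`, `write_and_invalidate`) are hypothetical, not part of MySQL, PostgreSQL, Memcached, or Redis.

```python
class CacheAside:
    """Toy cache-aside setup; dictionaries stand in for the real systems."""

    def __init__(self, database):
        self.database = database  # stand-in for the disk DBMS
        self.cache = {}           # stand-in for the cache; it starts cold (empty)
        self.db_reads = 0         # how many reads had to go to the "disk" database

    def read(self, key):
        # Serve from the cache when possible; otherwise load from the database.
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def write_db_only(self, key, value):
        # A writer that updates the database but forgets the cache:
        # subsequent reads keep returning the stale cached copy.
        self.database[key] = value

    def write_and_invalidate(self, key, value):
        # The usual remedy: drop the cached copy on every write.
        self.database[key] = value
        self.cache.pop(key, None)


if __name__ == "__main__":
    store = CacheAside({"user:1": "Alice"})
    print(store.read("user:1"), store.db_reads)  # cold start: the first read hits the database
    store.write_db_only("user:1", "Bob")
    print(store.read("user:1"))                  # stale value served from the cache
    store.write_and_invalidate("user:1", "Carol")
    print(store.read("user:1"), store.db_reads)  # fresh value, at the cost of another database read
```

The two copies are kept in sync only by application discipline: nothing ties the database write and the cache update into one transaction, which is the gap the combined approach has to live with.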
SOLVING PITFALLS WITH TARANTOOL
A microservice system needs the advantages of both cache databases and relational databases at the same time, and Tarantool claims to provide this combination.

Tarantool runs in memory and stores two files on disk: a snapshot of the data at a certain point in time, plus a log of all transactions. Its basic storage element is the tuple. A tuple can have any number of fields: it is simply an arbitrarily long list of fields, identified by a unique key. Each tuple belongs to a space, and indexes can be defined on tuple fields. If you'd like an analogy to a relational DBMS, a “space” corresponds to a table, and “fields” correspond to columns.

One thing Tarantool clearly fixes is the “cold start” problem. Think about the way in which MySQL and PostgreSQL integrate with cache databases: in an application, Postgres and MySQL do not respond until everything is loaded into memory, but their cache counterparts warm up much more slowly (1-2 Mbps), and therefore you need to use various hacks, like prewarming the index, to keep the two in sync (those who administer MySQL know this better than their pets' names). As for Tarantool, it just gets installed and runs properly from the start. The cold start time is as short as possible.

“Why Not Just Use Redis?” You Say
Tarantool's primary idea is to quickly process large amounts of data in memory using something more than just the key/value and other data structures that Redis provides (Tarantool's data model is closer to the MongoDB type, not just “data structures”). Just imagine that you have dozens or hundreds of gigabytes of live data: Tarantool gives you the opportunity to do something really complex and sophisticated with it.

Tarantool also maintains transactions: you get normal begin, commit, and rollback in stored procedures, and it has secondary indexes that are updated automatically, consistently, and atomically.
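To make the space, tuple, and index vocabulary concrete, here is a toy Python model. It is a sketch under stated assumptions, not Tarantool's API or implementation: the `Space` class and its methods are hypothetical, the first field of each tuple is assumed to be the primary key, and the "transaction" simply restores an in-memory copy on failure.

```python
class Space:
    """Toy in-memory 'space': tuples keyed by their first field, plus one secondary index."""

    def __init__(self, secondary_field):
        self.secondary_field = secondary_field  # position of the indexed field
        self.primary = {}                       # primary key -> tuple
        self.secondary = {}                     # field value -> set of primary keys

    def replace(self, tup):
        # Tuples may have any length, as long as the indexed field is present.
        if len(tup) <= self.secondary_field:
            raise ValueError("tuple is too short for the secondary index")
        old = self.primary.get(tup[0])
        if old is not None:
            self._unindex(old)
        self.primary[tup[0]] = tup
        # The secondary index is maintained together with the primary one.
        self.secondary.setdefault(tup[self.secondary_field], set()).add(tup[0])

    def _unindex(self, tup):
        value = tup[self.secondary_field]
        keys = self.secondary[value]
        keys.discard(tup[0])
        if not keys:
            del self.secondary[value]

    def select_by_secondary(self, value):
        return sorted(self.primary[key] for key in self.secondary.get(value, ()))

    def transaction(self, tuples):
        # All-or-nothing: on any error, restore both indexes and re-raise.
        saved_primary = dict(self.primary)
        saved_secondary = {value: set(keys) for value, keys in self.secondary.items()}
        try:
            for tup in tuples:
                self.replace(tup)
        except Exception:
            self.primary, self.secondary = saved_primary, saved_secondary
            raise


if __name__ == "__main__":
    users = Space(secondary_field=1)            # index on the second field (city)
    users.replace((1, "Moscow", "Alice"))
    users.replace((2, "Moscow", "Bob", "extra field"))
    print(users.select_by_secondary("Moscow"))
```

Copying the whole space before each transaction is affordable only in a toy; it is used here because it makes the rollback behaviour easy to see.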
Finally, Tarantool simply uses less memory than Redis.

LET'S CHECK THE NUMBERS
I am an empirical person who likes to test before making choices. To run my tests, I used two entry-level VPSes, because low-power VPSes tend to reveal flaws better than more powerful ones (the weaker the machine, the clearer the differences between Tarantool and its alternatives). The first machine is the server and the second is the client. I tested two traditional databases, MySQL and PostgreSQL, as well as Tarantool.

TEST VPS CONFIGURATION
Local testing of the computing power of sabbakka-1 and sabbakka-2 (CPU and RAM, plus disk subsystem performance) was performed using the sysbench utility.

1. To test the CPU, I calculated the prime numbers up to twenty thousand. The calculation runs in one thread by default. For parallel computations, you need to use the --num-threads=N switch.

   $ sysbench --test=cpu --cpu-max-prime=20000 run

CPU TEST RESULTS
The most peculiar thing here is the total time: the machine with less RAM finished faster. (This test makes sense to run on servers that have different computing power; on machines with the same characteristics the results will be more or less the same.) The results obviously don't depend on the amount of RAM but are probably attributable to the nuances of the provider's cloud infrastructure, because I used a virtual machine for the server and a container for the client.

2. Next I tested the disk subsystem of the server, sabbakka-1 (this I/O test only makes sense to run on the server). I executed three steps:

(a) Generated a set of test files with the command:

   $ sysbench --test=fileio --file-total-size=5G prepare

As a result, I received files with a total size of 5GB (several times bigger than the RAM, so that the operating system cache did not affect the results).
(b) Carried out testing:

   $ sysbench --test=fileio --file-total-size=5G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run

The test ran in random read/write mode (rndrw) for 300 seconds, and testing was performed in one thread (“Number of threads: 1”) by default.
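When the same runs have to be repeated on several machines, the command lines above can be assembled programmatically. The following Python sketch only builds the argument lists; the helper names `cpu_command` and `fileio_command` are hypothetical, and the options are exactly the ones quoted in the text (newer sysbench releases may spell some of them differently, so check `sysbench --help` on your installation).

```python
import shlex


def cpu_command(max_prime=20000, threads=None):
    """Build the CPU test command; threads=None keeps sysbench's single-thread default."""
    command = ["sysbench", "--test=cpu", f"--cpu-max-prime={max_prime}"]
    if threads is not None:
        command.append(f"--num-threads={threads}")
    command.append("run")
    return command


def fileio_command(stage, total_size="5G", max_time=300):
    """Build the disk test command for the 'prepare' or 'run' stage."""
    if stage not in ("prepare", "run"):
        raise ValueError(f"unsupported stage: {stage}")
    command = ["sysbench", "--test=fileio", f"--file-total-size={total_size}"]
    if stage == "run":
        # Random read/write for a fixed time, with no cap on the number of requests.
        command += [
            "--file-test-mode=rndrw",
            "--init-rng=on",
            f"--max-time={max_time}",
            "--max-requests=0",
        ]
    command.append(stage)
    return command


if __name__ == "__main__":
    for command in (cpu_command(), fileio_command("prepare"), fileio_command("run")):
        print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` on a machine where sysbench is installed; printing first makes it easy to compare the generated lines with the ones in the text.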