
E-Guide: SOLVING THE SCALABILITY PROBLEM WITH BIG DATA


How big data and distributed systems solve traditional scalability problems

Lacking NoSQL standards more dangerous than proprietary vendor lock-in?

The amount and variety of data collected is continuing to expand, and the need to mine this data is imperative. Those factors – in combination with the expansion of client devices, demand for highly componentized systems, and more – are making centralized computing a thing of the past. This expert e-guide outlines how embracing a more distributed architecture addresses those issues, especially in terms of big data. Learn how the advent of NoSQL options can provide an embarrassment of riches once standardization is the norm.


HOW BIG DATA AND DISTRIBUTED SYSTEMS SOLVE TRADITIONAL SCALABILITY PROBLEMS

It’s rare to see an enterprise that relies solely on centralized computing. But there are nevertheless still many organizations that keep a tight grip on their internal data center and eschew any more distribution than is absolutely necessary. Sometimes, this is due to investments in existing infrastructure. At other times, it is due to security concerns that arise from a risk-averse culture. However, centralization is becoming less and less feasible due to a number of unavoidable factors:

The number and variety of client devices increases year over year, creating an increasingly complex array of end points to serve

The amount and variety of data collected continues to expand exponentially with the use of social, mobile, and embedded technology

The need to mine this data for business insights becomes imperative in a competitive marketplace

The requirements of continuous development and deployment create a demand for systems that are highly componentized for greater agility and flexibility (SOA)

The cost of scaling internally to provide the computing resources to keep up with demand while maintaining acceptable performance levels becomes too high to handle from both an administrative and infrastructure standpoint

Having a potential single point of failure is unacceptable in an era when decisions are made in real time, loss of access to business data is a disaster, and end users don’t tolerate “downtime”

So how does embracing a more distributed architecture address the aforementioned issues? Different aspects of the paradigm resolve different types of performance issues. Here are just a few examples:


PEER PRESSURE IS A GOOD THING

The peer-to-peer distributed computing model ensures uninterrupted uptime and access to applications and data even in the event of partial system failure. Some vendor SLAs guarantee high availability with 99% uptime or higher, a feat which few enterprises can match using centralized computing. Automated failover mechanisms mean end users are often unaware that there is even a problem, since communication with the servers is not compromised. What about latency issues? SLAs may also be customized with specific performance metrics for response time and other factors that align with business objectives.

THE SKY IS THE LIMIT

The “virtually” unlimited scalability of the cloud provides the ability to increase or decrease usage of infrastructure resources on demand. Instant and automated provisioning and de-provisioning of servers and other resources allow enterprises to perform better by ensuring that end user access to applications keeps up with simultaneous, resource-intensive demand, even when traffic spikes unexpectedly.
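To make the idea of automated provisioning and de-provisioning concrete, here is a minimal Python sketch of the kind of decision an autoscaler might make. It is an illustration only, not any provider's actual API: the function names, the requests-per-second units, and the 20 percent headroom default are all assumptions chosen for the example.

```python
def desired_server_count(load_rps, capacity_per_server_rps,
                         min_servers=1, max_servers=100, headroom_percent=20):
    """Return how many servers the current load calls for.

    Hypothetical inputs: load and per-server capacity are whole numbers of
    requests per second; headroom_percent is spare capacity kept in reserve
    for unexpected traffic spikes.
    """
    if capacity_per_server_rps <= 0:
        raise ValueError("capacity_per_server_rps must be positive")
    required = load_rps * (100 + headroom_percent)
    available_per_server = capacity_per_server_rps * 100
    # Integer ceiling division: round up so demand is never under-served.
    needed = -(-required // available_per_server)
    # Clamp to the configured pool limits.
    return max(min_servers, min(max_servers, needed))


def scaling_action(current_servers, desired_servers):
    """Describe the step that moves the pool to the desired size."""
    delta = desired_servers - current_servers
    if delta > 0:
        return f"provision {delta} server(s)"
    if delta < 0:
        return f"de-provision {-delta} server(s)"
    return "no change"


if __name__ == "__main__":
    target = desired_server_count(load_rps=1000, capacity_per_server_rps=100)
    print(scaling_action(current_servers=4, desired_servers=target))
```

Running the decision on a schedule, or whenever a load metric changes, is what lets capacity follow demand in both directions; the minimum and maximum bounds keep a faulty metric from shrinking the pool to nothing or growing it without limit.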


DATA IS A BIG DEAL

The use of distributed systems also has implications for “Big Data”. The advent of NoSQL options provides an opportunity for enterprises to bifurcate their data stream to accept and fully utilize both relational data via SQL DBs and non-relational data with DB options such as MarkLogic and MongoDB. Arnon Rotem-Gal-Oz, Architecture Director at Nice Systems, points out that SQL still has the edge when it comes to reporting functionality, security and manageability. On the other hand, he admits, “If you have scale problems that are hard or expensive to solve with traditional technologies, NoSQL fills these needs in ways that you didn’t have before.”

Implementing native applications that run on thick clients relieves servers of some of their workload and can deliver a faster and more user-friendly experience (assuming there isn’t a need to update data frequently between the client and the server). Using a tiered structure that divides responsibilities among web, application and data servers can permit organizations to outsource any of these processes or layers that can be handled most effectively by a third party vendor. This type of multi-tiered distributed computing can also be used to lessen the burden on internal servers even when deploying applications for thin clients such as mobile devices.


BARGAIN BASEMENT PRICING

Large scale distributed technology has reached the point where third party data center and cloud providers can squeeze every last drop of processing power out of their CPUs to drive costs down further than ever before. Even an enterprise-class private cloud may reduce overall costs if it is implemented appropriately. The number of vendors in the cloud arena is still growing, leading to more and more competitive pricing arrangements. In addition to lowering costs, relieving the administrative burden from internal IT personnel may free up resources for developing applications that improve performance in other ways.

VERSATILITY IN TECHNOLOGY CHOICES

A distributed architecture is able to serve as an umbrella for many different systems. Hadoop is just one example of a framework that can bring together a broad array of tools such as (according to Apache.org):

Hadoop Distributed File System that provides high-throughput access to application data


Hadoop YARN for job and cluster resource management

Hadoop MapReduce for parallel processing of big data

Pig, a high-level data-flow language for parallel computation

ZooKeeper, a high-performance coordination service for distributed applications

This framework may be of special interest to enterprises since some very bright minds are working on commercialization projects at Yale University right now in concert with Hadapt. Dr. Daniel Abadi believes that “Hadoop is going to make it to the next level. We saw a lot of adoption in 2012. Now it’s about trying to figure out the ‘perfect’ Hadoop use cases. So, building some vertical-specific applications is going to be a pretty big trend for 2013.” Those use cases that increase distributed computing and business performance will be trailblazers to watch.
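For readers unfamiliar with the MapReduce model named in the list of Hadoop tools above, the following Python sketch shows its three phases on a word count, the customary introductory example. It runs in a single process and does not use Hadoop or any of its APIs; the function names are invented for this illustration, and a real cluster would spread the map and reduce work across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group the emitted values by key, the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values collected for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # On a cluster each document (or block of a file) could be mapped on a
    # different node; here the map calls simply run one after another.
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big systems"]))
```

Because each map call looks at only one document and each reduce call at only one key, neither step depends on shared state, which is the property that allows the work to be parallelized.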


LACKING NOSQL STANDARDS MORE DANGEROUS THAN PROPRIETARY VENDOR LOCK-IN?

For enterprise architects and software developers interested in exploring a vast array of options for managing application data, the choice in data persistence solutions is no longer limited to the relational world, as the range and depth of NoSQL options has become an embarrassment of riches. MongoDB, Cassandra, Couchbase, Neo4j, VoltDB, Elasticsearch and Redis make up only a short list of the various NoSQL options available. Almost anything you want to do can be done using the various NoSQL products on the market today.

But as any software architect knows, it is not always possible to get everything that is needed from a single database product. As a result, seasoned professionals aren’t afraid to cobble together a combination of database solutions to approximate what is really needed. Of course, since there are so many different factors involved with NoSQL, this can rapidly become a challenge. Some of the options and features an application architect needs to consider when choosing a solution for handling persistence include:


Programming languages

Querying languages

Architectures

Reporting functions

Connectors

Hardware or infrastructure requirements

Licenses

Protocols

Operating systems and more


NOSQL, NO STANDARDS, NO END IN SIGHT

Many enterprises are simply trying to decide what’s most important and are thus implementing the NoSQL solution that comes closest to delivering what they need right now. Then, they cross their fingers and hope that they don’t have to switch vendors anytime soon. They know deep down that even in an age of less contractual and licensing red tape, portability would be a huge problem. The lock-in is a byproduct of the fact that there are no standards.

Ann Kelly and Dan McCreary, coauthors of Making Sense of NoSQL, have plenty to say about this dearth of standardization. That is, after all, one of the things that seems to make NoSense in the NoSQL world. Other authors have claimed that the lack of standardization is simply an unavoidable feature of having such a wide variety of solutions. But that’s kind of a cop-out. Kelly points out that all you have to do to uncover the real reason is to follow the money. It’s not in the short term financial interest of venture capital firms to support standards. “Their goal up front is to get a return on their investment. You don’t get that by investing in standards. You can only get a return on your investment if your solution is unique enough for people to adopt it.”

The splintered nature of the current NoSQL market creates an ideal environment to maximize short term profits. Of course, in the long term, it will pay off to have standards. The few niche players that survive the blood bath and become the standard setters will attract lots of third party developers, helping grow demand for their products.

THE PUSH TOWARDS STANDARDIZATION

The idea that there’s simply nothing that can be done isn’t accurate either. McCreary revealed that useful and relevant standards are already available. It’s just that efforts to make them relevant industry-wide have been spotty. “There are standards that NoSQL developers could implement. In fact, there are third parties that are grafting them on. There are two separate efforts now to put XQuery interfaces on MongoDB. That means I could port my data between MarkLogic which is a native XML system and MongoDB and maybe 3 other native XML that have good scale characteristics.”

The present may look muddled, but there is hope for the future. As McCreary says, “Standards are beautiful, wonderful things, but you cannot predict when they’re going to be adopted. There are uncontrollable events that suddenly allow them to take off.” As the Sam Cooke song says, change is gonna come. We just don’t know when or from what direction. All it took was Steve Jobs saying, “No Flash on Apple products, ever!” to get the ball rolling on SVG standards. So, there may be a corresponding flashpoint for NoSQL in the commercial field as well. That will be a glad day. Software architects are waiting with bated breath for a champion to appear, and for standards to be the norm when it comes to working with NoSQL solutions.
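The portability argument can be sketched in a few lines of Python. The example below is not XQuery and does not talk to MongoDB, MarkLogic, or any real product: `DocumentStore`, `DictStore`, `ListStore`, and `migrate` are hypothetical names, and the two in-memory classes merely stand in for vendors with different internals. The point it illustrates is that once applications and stores agree on one interface, moving data between stores needs no vendor-specific code.

```python
from abc import ABC, abstractmethod


class DocumentStore(ABC):
    """Hypothetical shared interface, standing in for an agreed standard."""

    @abstractmethod
    def put(self, key, document):
        """Store a document under a key, replacing any earlier version."""

    @abstractmethod
    def items(self):
        """Return all (key, document) pairs held by the store."""

    def find(self, field, value):
        """Return documents whose field equals value, ordered by key."""
        ordered = sorted(self.items(), key=lambda pair: pair[0])
        return [doc for _, doc in ordered if doc.get(field) == value]


class DictStore(DocumentStore):
    """Stand-in for one vendor: keeps documents in a dict."""

    def __init__(self):
        self._docs = {}

    def put(self, key, document):
        self._docs[key] = dict(document)

    def items(self):
        return list(self._docs.items())


class ListStore(DocumentStore):
    """Stand-in for a second vendor with a different internal layout."""

    def __init__(self):
        self._rows = []

    def put(self, key, document):
        # Drop any earlier row for this key, then append the new version.
        self._rows = [(k, d) for k, d in self._rows if k != key]
        self._rows.append((key, dict(document)))

    def items(self):
        return list(self._rows)


def migrate(source, target):
    """Copy every document across using only the shared interface."""
    for key, document in source.items():
        target.put(key, document)
```

Application code that calls only `put`, `items`, and `find` keeps working after `migrate(old_store, new_store)`, whichever class sits behind the interface. Without such a shared interface, each migration requires translating both the data and every query by hand, which is the lock-in described above.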


FREE RESOURCES FOR TECHNOLOGY PROFESSIONALS

TechTarget publishes targeted technology media that address your need for information and resources for researching products, developing strategy and making cost-effective purchase decisions. Our network of technology-specific Web sites gives you access to industry experts, independent content and analysis and the Web’s largest library of vendor-provided white papers, webcasts, podcasts, videos, virtual trade shows, research reports and more, drawing on the rich R&D resources of technology providers to address market trends, challenges and solutions. Our live events and virtual seminars give you access to vendor neutral, expert commentary and advice on the issues and challenges you face daily. Our social community IT Knowledge Exchange allows you to share real world information in real time with peers and experts.

WHAT MAKES TECHTARGET UNIQUE?

TechTarget is squarely focused on the enterprise IT space. Our team of editors and network of industry experts provide the richest, most relevant content to IT professionals and management. We leverage the immediacy of the Web, the networking and face-to-face opportunities of events and virtual events, and the ability to interact with peers, all to create compelling and actionable information for enterprise IT professionals across all industries and markets.
