What Is Big Data?

What is Big Data? Enter Table of Contents Defntion of Big Data 04 The History of Big Data 08 Big Data Use Cases 10 How Big Data Works 13 Big Data Best Practices 15 Put Big Data to Work 18 Defnition of Big Data 4 What exactly is big data? To really understand big data, it’s helpful to have some historical background. Here’s Gartner’s To put it simply: big data is larger, more defnition, circa 2001(which is still the go-to complex data sets, especially from new data defnition): sources. These data sets are so voluminous that traditional data processing software just can’t “Big data is data that contains greater variety manage them. But these massive volumes of arriving in increasing volumes and with ever data can be used to address business problems higher velocity. This is known as the three Vs.” you wouldn’t have been able to tackle before. 5 THE THREE VS OF BIG DATA Velocity. Velocity is the fast rate at which data is received and (perhaps) acted on. Normally, • Volume. The amount of data matters. With the highest velocity of data streams directly into big data, you’ll have to process high volumes memory versus being written to disk. Some of low-density, unstructured data. This can be internet-enabled smart products operate in real data of unknown value, such as Twitter data time or near real time and will require feeds, clickstreams on a webpage or a mobile real-time evaluation and action. app, or sensor-enabled equipment. For some organizations, this might be tens of • Variety. In today’s big data world, data terabytes of data. For others, it may be comes in new unstructured data types. hundreds of petabytes. Unstructured and semi-structured data types, such as text, audio, and video require addition preprocessing to derive meaning and support metadata. 1 Volume Velocity 2 BIG DATA Variety 3 6 THE VALUE—AND TRUTH—OF Recent technological breakthroughs have exponentially reduced the cost of data storage BIG DATA and compute, making it easier and less expensive to store more data than ever before. With an Since 2001, two more Vs have become apparent: increased volume of big data now cheaper and value and veracity. Data has intrinsic value. But more accessible, you can make more accurate it’s of no use until that value is discovered. and precise business decisions. Equally important: How truthful is your data—and Finding value in big data isn’t only about how much can you rely on it? analyzing it (which is a whole other beneft). It’s an entire discovery process that requires Today, big data has become capital. Think of insightful analysts, business users, and some of the world’s biggest tech companies. A executives who ask the right questions, recognize large part of the value they offer comes from patterns, make informed assumptions, and their data, which they’re constantly analyzing to predict behavior. produce more effciency and develop new products. But how did we get here? 7 The History of Big Data 8 Around 2005, people began to realize just how While big data has come far, its popularity is only much data users generated through Facebook, just beginning. Cloud computing has expanded YouTube, and other online services. Hadoop (an big data possibilities even further. open-source framework created specifcally to store and analyze big data sets) was developed The cloud offers a truly elastic scalability, where that same year. NoSQL also began to gain developers can just spin up ad hoc clusters to popularity during this time. test around a subset of data. It’s an exciting time to see what’s going to happen next. The development of open-source frameworks, such as Hadoop (and more recently, Spark) was essential for the growth of big data because they BENEFITS OF BIG DATA AND make big data easier to work with and cheaper to DATA ANALYTICS store. In the years since then, the volume of big data has skyrocketed. Users are still generating THE VALUE OF BIG DATA COMES IS TWOFOLD: huge amounts of data—but it’s not just humans. 1. Big data makes it possible for you to gain more complete answers because you have With the advent of Internet of Things (IoT) , more more information objects and devices are connected to the internet, gathering data on customer usage patterns and 2. More complete answers means more product performance. The emergence of machine confdence in the data–which means learning has produced still more data. a completely different approach to tackling problems 9 Big Data Use Cases 10 This new knowledge enables you to handle use CUSTOMER EXPERIENCE cases that you haven’t been able to fully delve into before. Here are just a few. (More use cases The race for customers is on. A clearer view of are on our solutions page): customer experience is more possible now than ever before. Big data enables you to gather data from social media, web visits, call logs, and other data sources to improve the interaction PRODUCT DEVELOPMENT experience and maximize the value delivered. Start delivering personalized offers, reduce Companies like Netfix and Procter & Gamble customer churn, and handle issues proactively. use big data to anticipate customer demand. By classifying key attributes of past and current products or services, and then modeling the relationship between those attributes and the FRAUD AND COMPLIANCE commercial success of the offerings, they build predictive models for new products and services. When it comes to security, it’s not just a few In addition, P&G uses data and analytics from rogue hackers; you’re up against entire expert focus groups, social media, test markets, and teams. Security landscapes and compliance early store rollouts to plan, produce, and launch requirements are constantly evolving. Big new products. data helps you identify patterns in data that indicate fraud and aggregate large volumes of information to make regulatory reporting much faster. PREDICTIVE MAINTENANCE Factors that can predict mechanical failures may be deeply buried in structured data, such as the MACHINE LEARNING equipment year, make, and model, as well as in unstructured data that covers millions of log Machine learning is the hottest of hot topics right entries, senor data, error messages, and engine now. And data—specifcally big data—is one of temperature., By analyzing these indications of the reasons why. It’s only recently that we’ve potential issues before the problems happen, been able to teach machines instead of program organizations can deploy maintenance more cost them. And the availability of big data to train effectively and maximize parts and machine-learning models makes that happen. equipment uptime. 11 OPERATIONAL EFFICIENCY BIG DATA CHALLENGES Operational effciency may not always make the While big data holds a lot of promise, it is not news, but it’s an area in which big data is having without its challenges. the most impact. With big data, you can analyze First, big data is… big. Although new and assess production, customer feedback and technologies have been developed to store data, returns, and other factors to reduce outages and data volumes are doubling in size around every anticipate future demands. Big data can also be two years. Organizations still struggle to keep used to improve decision-making in line with pace with their data and fnd ways to effectively current market demand. store it. But it’s not enough to just store the data. Data must be used to be valuable, and that depends DRIVE INNOVATION on curation. Clean data, or data that’s relevant Big data can help you innovate by studying to the client and organized in a way that enables interdependencies between humans, institutions, meaningful analysis requires a lot of work. entities, and process and then determining new Data scientists spend 50 to 80 percent of their ways to use those insights. Use data insights to time curating and preparing data before it can improve decisions about fnancial and planning actually be used. considerations. Examine trends and what customers want to deliver new products and Finally, big data technology is changing at a fast services. Implement dynamic pricing. There are pace. A few years ago, Apache Hadoop was the endless possibilities. popular technology used to handle big data. That is, until Apache Spark was introduced in 2014. Today, a combination of the two frameworks appears to be the best approach. Keeping up with big data technology is an ongoing challenge. 12 How Big Data Works 13 Oracle Cloud for Big Data Analytics Data Integrate Analyze Data Providers Consumers Adaptive Intelligence Public Data Services Manage Streaming Business Analytics Enterprise Enterprise Apps Big data gives you new insights that open up You can store your data in any form you want new opportunities and business models. Getting and bring your desired processing requirements started involves three key actions: and necessary process engines to those data sets on an on-demand basis. Many people choose their storage solution according to where their data is currently residing. The cloud is gradually INTEGRATE gaining popularity because it supports your current compute requirements and enables you Big data brings together data from many to spin up resources as needed. disparate sources and applications. Traditional data integration mechanisms, such as ETL (extract, transform, and load) generally aren’t up to the task. It requires new strategies and technologies to analyze big data sets at terabyte, ANALYZE or even petabyte, scale. At the same time, big Your investment in big data pays off when you data has the same requirements for quality, analyze and act on your data.

What Is Big Data?

How Big Data Enables Economic Harm to Consumers, Especially to Low-Income and Other Vulnerable Sectors of the Population

Issues, Challenges, and Solutions: Big Data Mining

Oracle Big Data SQL Release 4.1

Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners

Internet of Things Big Data Artificial Intelligence

Artificial Intelligence and Big Data – Innovation Landscape Brief

Big Data Platforms, Tools, and Research at IBM

Artificial Intelligence, Big Data and Cloud Taxonomy

Big Data, Data Mining and Machine Learning

Hard Disk Drive Failure Detection with Recurrence Quantification Analysis

How to Store and Process Big Data: Are Today’S Databases Suﬀicient? Jaroslav Pokorný

Abstract Introduction Relational Database Management Systems (RDBMDS)