I’M CONSTANTLY HEARING ABOUT “BIG DATA,” BUT THE TERM SEEMS VAGUE. WHAT IS IT?
It is vague, but it’s real. A good working definition is: data that is too large, complex, and dynamic for conventional data tools to capture, store, manage, and analyze.

WHAT’S THE BIG DEAL? IS THERE REALLY THAT MUCH DATA?
Every minute we send over 200 million emails, click almost 2 million likes on Facebook, send almost 300,000 tweets, and upload 200,000 photos to Facebook and 100 hours of video to YouTube. Google chairman Eric Schmidt estimates that “from the dawn of civilization until 2003, humankind generated five exabytes [five billion gigabytes] of data. Now we produce five exabytes every two days … and the pace is accelerating.”

As the infographic shows, the last few years have seen a vast proliferation of data. What’s noteworthy is not so much the volume as the opportunity to learn from it. Airlines can predict which routes customers will want to fly at which times of day. Manufacturers can predict equipment failure and perform preventive maintenance. As the amount of data has exploded, so has our ability to mine it for value.
WHY IS THERE SO MUCH DATA?
One consultant, Bernard Marr, calls it the “datafication of the world.”

Consider listening to music. Twenty years ago, listening to broadcast radio, or purchasing and listening to a CD, generated almost no data. Now, music streaming services gather extensive data on which songs we listen to, when and how long, which songs we skip, our ratings and purchases, and which ads we hear and respond to.

Cell phones track our location, generating a continuous stream of geographic information. Phone calls, text messages, emails, clicks on web links, likes, and retail transactions each generate data that can be recorded, aggregated, and analyzed. Sequencing a single human genome yields data on about 3 billion base pairs. Security and surveillance cameras are constantly recording video that is stored and analyzed. Data is gathered from sensors and telemetry. We live our lives increasingly online, and all of it can be “datafied.”
IT SOUNDS LIKE A DATA WAREHOUSE. ISN’T THAT WHAT A RELATIONAL DATABASE MANAGEMENT SYSTEM IS FOR?
It’s what data warehouses have evolved into. But Big Data is orders of magnitude beyond the capabilities of a traditional RDBMS. Relational databases were designed to store and retrieve structured data—customer records, part numbers, order information. They can’t efficiently handle the massive volumes of structured (tabular) and unstructured (freeform) data we are now generating. (See our upcoming post, Unstructured Data, for more on this.)
WHAT ARE THE CHALLENGES OF BIG DATA?

Imagine the complexities of developing a system to efficiently store and rapidly access these vast quantities of data. Tech consulting firm Gartner has expressed them as the three Vs shown in the figure: Data Volume, Velocity, and Variety. Data Veracity, Variability, and Complexity are sometimes added to the list. For more on what these terms mean, see our post on Big Data Challenges.
IS IT WORTH THE TROUBLE?
It’s an inevitable business reality. Many modern businesses face rapid growth in unstructured data that contains crucial business information they can’t access using traditional data systems. Big Data is manageable only with a suitable platform and tools. If you aren’t using that data, your competitors soon will.

For example, credit card companies use Big Data methods to detect irregular purchase patterns that may indicate fraud. Brokerages use similar methods to detect securities fraud. Customers now expect this kind of protection, and companies that lack it would quickly fall behind. You’ll find many more examples in our upcoming post, Big Data Examples.
CAN BIG DATA HELP MY BUSINESS?
Big Data can open up new analytical opportunities that were previously infeasible. For more details and some suggested next steps, see our upcoming post, Big Data Opportunities.
WHERE CAN I LEARN MORE ABOUT BIG DATA?
See our posts on Big Data Examples, Big Data Challenges, and Big Data Opportunities. We also recommend these resources for a deeper dive into Big Data:

WHO IS PEAXY? WHY ARE YOU TELLING ME THIS?
Peaxy software empowers universal data access for the enterprise, so you can save, find, analyze, manage, and reuse your data, whenever it was created and wherever it is located. Our Executive Series is designed to help you get up to speed quickly on the key topics related to Big Data and data access.