What do you do with accidental Big Data?


Spectra Logic’s Kevin Dudak is a contributing blogger for the Inside Big Data Blog. His most recent post has been reprinted below with permission from Rich Brueckner:

I got to thinking about this after a customer call that I expected to take 5 minutes ran for an hour. He is quickly approaching 100 PB of data and does not have a plan for it. He has multiple disk and tape storage systems, with four different software solutions that manage portions of the data. I don’t think the company ever expected to grow to this size when it made its software and hardware decisions over the last 10 years. They are now facing several major challenges:

  • Knowledge.
    • They have too many types of hardware and software for their staff to remain competent with all of them.
  • Support.
    • With so many different systems, the number of support contracts is difficult to manage, let alone the complexity of keeping everything running.
  • Power.
    • Power and cooling are crushing them. The monthly bill is affecting the company’s finances, and they are struggling to obtain more power to grow.
  • Data.
    • In the end, this should all be about the data, but with the data spread across so many systems and technologies, it is difficult to access and use, at best.

This company is not alone in its challenges; the same thing has happened to far too many organizations. Data islands and different storage systems all made sense when they were deployed and IT looked at them as single, standalone solutions. Several years and a lot of growth later, many companies that didn’t consider themselves a ‘data company’ now find themselves with Big Data. The challenge now is to figure out how to get out of the unplanned mess they are in and get things straightened out.

This is a challenge that users, integrators and manufacturers should be working on for the next few years. As we talk about how to solve these problems, I think the first step is to focus on the data. The data is the reason we have all these storage and computing resources. There are a number of things being done to solve these challenges. I’ll be sharing more about this in future posts.