Thinking About Big Data on the Eve of Spring Trade Show Season


Spectra Logic’s Kevin Dudak recently became a contributing blogger for the Inside Big Data blog. His first post, "Thinking About Big Data on the Eve of Spring Trade Show Season," is reprinted below with permission from Rich Brueckner:

Thinking about Big Data on the eve of the spring trade show season

The month of March brings longer days, warmer weather and the start of the spring trade show season. There seem to be as many trade shows as there are interests and industries. Last year, we saw a lot of people start talking about Big Data at these shows. The trend will most likely continue, with Big Data taking a bigger share of the conversation.

Given my years in the storage industry, it should come as no surprise that I tend to look at the storage side of Big Data. Over the last year we have heard a lot about the analytics side. It is exciting to see all the amazing things we can do, and learn, with the massive amount of data we have at our fingertips these days. Without a doubt, much of the conversation will continue to focus on leveraging our data sets with tools like Hadoop. Sometimes it seems we forget that Big Data is more than just analytics; it is also about storing and managing potentially massive data sets. In 2012, users and vendors will start to address the changes Big Data brings to storage.

The 2012 Tape Summit and the HPC Symposium kick off the season. The second annual Tape Summit is a gathering of top manufacturers in the data tape industry, including drive, library, software and media companies, as well as press, analysts and bloggers. You don’t see tape and Big Data in the same conversation too often, but I think the tape industry will be looking to change that this year. We will be hearing about the Linear Tape File System (LTFS), continued innovation in data management software and possibly the coming LTO6, and how all of these can have a big impact on storing lots of data.

The HPC Symposium will see presentations from some of the top organizations in the distributed high-performance computing world. Many of the lessons the HPC world has learned over the last five years will make the adoption of Big Data easier and more effective.

I’ll be watching to see whether LTFS might be a good answer to Big Data portability. We are already seeing LTFS gain traction in some verticals, like Media and Entertainment. How to move petabytes of data, whether to seed a cloud provider or simply to relocate it, has always been a problem. LTFS might just provide a good answer.

Dealing with massive data sets, whether that means checking the data's integrity or protecting it, is a struggle we all face at one time or another. We are starting to see a new crop of software vendors, some in the Active Archive Alliance, that are creating data storage environments aimed at these challenges.

Finally, with LTO6 expected to ship this calendar year, we will see a doubling of native capacity on media. There should be performance improvements as well. Since the LTO consortium is attending the Tape Summit, hopefully we will get more details on it and on how it might affect the economics of storing Big Data.

As March rolls on, we should start to see a lot of information coming out of events such as the HPC Symposium and the Tape Summit, not only on how to analyze Big Data but also on how to manage and store it when it isn’t being crunched.