A Deeper Look into the Data Management Benefits with the iRODS and Spectra Logic Integration


Managing and safeguarding research data is important for replicating results, documenting collaborations, complying with governmental and institutional rules and regulations, planning future research, and ensuring published discoveries remain available to the public for the long term. So it’s not surprising that researchers, whether working within an HPC or general university environment, wish it were possible to keep data forever, on a low-cost medium, with minimal risk of failure or data loss. Spectra Logic and the iRODS Consortium recently announced an integration that will help researchers and organizations drive innovation and accelerate results.

A simple solution for storing complex research

Imagine that you’re a researcher, gathering data on a complex topic like cosmological simulations of galaxy formation or the human genome and the molecular mechanisms of cancer. You hope that there is a simple way to ensure that your data remains safe, accessible, and protected throughout the lifecycle of your study. You learn that there are, in fact, hundreds of options, varying in complexity, capacity, capability, and cost. But what matters to you is your research data, not the underlying system on which that data will inevitably reside. The best scenario would seem to be one that uses industry-standard interfaces in a non-proprietary manner and lets your team keep relying on the tools it is already comfortable with, like iRODS, while simplifying future workflows.

Enter the Spectra Logic and iRODS integrated solution. Now you can use industry-standard interfaces (S3 and Glacier) to write your data to any S3-compliant appliance, including Spectra Logic’s BlackPearl Platform, with the potential advantage of adding a Glacier tier of tape or spin-down disk behind it. The solution allows researchers to focus on what is most important to them, gathering and analyzing research data, while knowing that the collected data is stored on a medium designed to keep it accessible for the long term.

iRODS is open source data management software focused on data discovery, data virtualization, workflow automation, and secure collaboration.

What makes this integration unique?

Many researchers are already using iRODS and its many plugins to access, move and manage their current research data sets. With the joint solution, these same end users can continue to utilize iRODS to virtualize their data storage and management resources into collections, while now gaining access to a simple appliance that can be hosted on-premises, allowing researchers to utilize their data within any S3-compatible application. This provides them with full autonomy and ownership over the physical bits and bytes of the data they have collected, and enables them to connect to and use cloud resources, without relying solely on the cloud.

A few benefits of building a simple on-premises archive with iRODS and Spectra Logic include:

  • Utilizing industry-standard cloud interfaces (S3 and Glacier) for on-premises disk and on-premises tape data storage
  • Creating an air gap within the data storage system with Glacier tape
  • Data access via iRODS credentials
  • Data migration via iRODS S3 plugin
  • Data management via iRODS Rule Engine or Spectra StorCycle Storage Lifecycle Management Software
  • Long-term retention of data for upwards of 15 years if stored on tape
  • Data discovery enhanced via robust metadata markup
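
To make the data management and tiering benefits above more concrete, here is a minimal sketch of an age-based archive policy, written in plain Python. It is an illustration only: it does not use the iRODS Rule Engine, the iRODS S3 plugin, or StorCycle, and the tier labels, the `retain_on_disk` metadata key, the 90-day window, and the example paths are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical tier labels, used only within this sketch.
DISK_TIER = "s3-disk"
TAPE_TIER = "glacier-tape"


@dataclass
class DataObject:
    path: str                  # logical path of the object in a collection
    last_accessed: datetime    # timezone-aware timestamp of the last read
    metadata: dict = field(default_factory=dict)  # attribute/value pairs


def choose_tier(obj, now, archive_after=timedelta(days=90)):
    """Return the tier an object should live on under this toy policy."""
    # An explicit (hypothetical) metadata flag overrides the age rule.
    if obj.metadata.get("retain_on_disk") == "true":
        return DISK_TIER
    # Objects not read within the window become candidates for tape.
    if now - obj.last_accessed >= archive_after:
        return TAPE_TIER
    return DISK_TIER


def plan_migration(objects, now):
    """List the paths that the policy would move to the tape tier."""
    return [o.path for o in objects if choose_tier(o, now) == TAPE_TIER]


# Example catalog with made-up paths and timestamps.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
catalog = [
    DataObject("/zone/home/lab/sim_001.h5",
               datetime(2023, 6, 1, tzinfo=timezone.utc)),
    DataObject("/zone/home/lab/sim_002.h5",
               datetime(2023, 12, 15, tzinfo=timezone.utc)),
    DataObject("/zone/home/lab/reference.fa",
               datetime(2023, 1, 1, tzinfo=timezone.utc),
               {"retain_on_disk": "true"}),
]
print(plan_migration(catalog, now))
```

In a real deployment, a decision like this would presumably be expressed as a policy in the data management layer and driven by the metadata attached to each object, rather than by a standalone script.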

Terrell Russell, executive director of the iRODS Consortium, commented on the iRODS and Spectra Logic integration: “We look forward to a lasting collaboration with Spectra Logic that will help our mutual customers drive innovation and accelerate business results.”