In today’s business environment, data is a valuable asset – provided it lives where it is most efficient. Active Archiving removes a set of reference data from an overloaded relational database and keeps it active in an archive where it can be easily and quickly retrieved when needed.
Floyd Christofferson, Director of Storage Product Marketing, SGI, suggests how this might be achieved.
The mainstream adoption of HD, 3D, mobile and streaming services presents an archiving challenge for the digital media industry: scaling storage and support systems cost-effectively while still providing the capacity and speed of information retrieval required.
Even though more and more digital media files are filling up ever-larger disk silos, propelled by the proliferation of media, the amount of data is growing faster than the need to access it. In the digital media sector, any specific file is rarely accessed, but when it is needed that access must be immediate; business users and consumers want files available at all times.
For some businesses this challenge would be addressed through better data management, but translated to the digital media and production sector the challenge becomes daunting. The problem goes beyond personal preference, where consumers expect online media to be instantly accessible through services such as BBC iPlayer or Spotify. Instead the issue is a business necessity: the business needs to have access to the full range of its data at all times.
Always-on and accessible
An active archive means data is always available in an ‘online’ state. In the context of an active archive, ‘online’ means that the data is held in an environment that is immediately and easily accessible to users, that does not draw unnecessary power or take up unnecessary space, and in which the data is protected for a long time.
An active archive strategy, when properly applied, significantly reduces overall storage and data management costs whilst improving efficiencies and the ability for users to access all data.
In essence, the data should live where it is most efficient. For example, inactive data that has retention value can be moved into an archive storage tier that, although ‘online’ and visible to the user, is typically in a powered-down state using Massive Array of Idle Disks (MAID) technology that completely removes power from the array. These archives, while still available to users, can be managed with very different disaster recovery techniques that require less investment, and at a fraction of the operational costs of conventional disk-based file stores.
This is in sharp contrast to a traditional archiving approach, where data often ends up residing in an off-site tape store that requires hours, if not days, for data retrieval.
Implementing an Active Archive
There are numerous tools that can simplify the implementation of an active archive strategy. These can be categorised as:
• Digital Asset Management: Leading digital asset management systems automatically index content in multiple ways as it is created and modified. Using this metadata, users can search for data, and administrators can easily set policies to automatically determine which data should remain on production disk drives and which can migrate to lower-cost, higher-efficiency second- or third-tier storage.
• Hierarchical Storage Management (Tier Virtualization): Another cost-effective technique that can aid in developing an active archive is to virtualize tiers of storage through the use of a hierarchical storage management solution. Such solutions enable multiple tiers of disk and tape to appear to users as one large aggregated volume even though the data is actually distributed across multiple storage types.
The beauty of this system is that all the data appears to the user to be online on the high-speed, expensive production disk at all times. In reality, even though the file appears to be right where the user put it in the file system, it may have migrated to lower-cost storage. This approach delivers dramatic overall cost savings without the need for users to learn and follow where their content is located.
• Low-power mass storage using MAID: A MAID system is another significant tool in creating a lower-cost active archive. By selectively powering down whole sections of the disk array until the data is needed, MAID significantly reduces the power and cooling requirements of the data centre, much like tape libraries do, but with the added advantage of much higher performance and proactive data protection.
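To make the interplay of these tools concrete, here is a minimal sketch in Python of a policy-driven migration behind a single logical namespace. It is an illustration only, not the behaviour of any particular product: the tier names, the `Asset` record, the `apply_policy` and `open_asset` functions, and the 90-day idle threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical tier names; real systems define their own.
PRODUCTION = "production-disk"
ARCHIVE_MAID = "maid-archive"


@dataclass
class Asset:
    path: str              # logical path the user sees; it never changes
    last_access: datetime  # metadata of the kind an asset manager might index
    tier: str = PRODUCTION  # physical tier, hidden from the user


def apply_policy(assets, now, max_idle=timedelta(days=90)):
    """Move assets idle longer than max_idle (an assumed threshold)
    to the archive tier. Returns the logical paths that were migrated."""
    moved = []
    for asset in assets:
        if asset.tier == PRODUCTION and now - asset.last_access > max_idle:
            asset.tier = ARCHIVE_MAID  # only the physical location changes
            moved.append(asset.path)
    return moved


def open_asset(asset, now):
    """Simulate a user opening a file through the virtual namespace.
    Returns the unchanged logical path and whether powered-down disks
    would have to spin up to serve the request."""
    needs_spin_up = asset.tier == ARCHIVE_MAID
    asset.last_access = now
    return asset.path, needs_spin_up
```

In this sketch a file that has sat untouched for more than 90 days is reassigned to the archive tier, yet a later call to `open_asset` returns the same path the user originally saved it under; the only visible difference is the spin-up flag, which stands in for the short delay while idle disks power on.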
Protecting the Data That Is Your Business
An active archive strategy requires effective planning and deployment of management tools. When implemented effectively it can considerably reduce the overall cost of managing a growing pool of digital data. Individual components can be upgraded or changed without impacting the user experience. In this scenario, scalability becomes an asset, and not a headache.
Floyd Christofferson has focused on content management and storage workflows over the last 25 years, paying particular attention to the technology associated with effective management of massive volumes of data, both from the perspective of hardware strategies and of workflow. http://www.sgi.com/products/storage/archive
The Active Archive Alliance launched on April 27, 2010 as a collaborative industry association formed to educate end user organizations on the evolving new technologies that enable reliable, online and efficient access to their archived data. Alliance members strive to extend solutions beyond the high-end supercomputing and broadcast markets to the greater general IT audience that is in need of online data archive options. Founding members include SGI, Dell, QStar, FujiFilm and Spectra. http://www.activearchive.com/