Clustering Your Unstructured Data

Arthur Cole

Dealing with growing mounds of unstructured data is proving to be a thornier issue than most people realized, throwing a kink into server and storage consolidation plans as managers face up to the sheer amount of data migration and network downtime that is involved.


A recent survey from the Taneja Group showed that more than half the respondents had 11 TB or more of unstructured data on their networks, with upwards of 60 percent saying that growth was averaging about 75 percent a year. The majority of that data is coming from Microsoft Office, e-mail and backup/archival systems.
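To get a feel for what that growth rate means, here is a rough sketch that compounds the survey's 11 TB figure at 75 percent a year. The function name and the five-year horizon are illustrative assumptions, not part of the survey, and real growth will rarely be this smooth.

```python
def project_capacity(start_tb, annual_growth, years):
    """Return projected capacity in TB for year 0 through `years`,
    assuming growth compounds once per year at a constant rate."""
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# Survey figures: 11 TB today, growing about 75 percent a year.
# The five-year horizon is an arbitrary choice for illustration.
projection = project_capacity(11, 0.75, 5)

for year, tb in enumerate(projection):
    print(f"Year {year}: {tb:.1f} TB")
```

Under these assumptions, an 11 TB store passes 50 TB by the third year, which is why capacity alone tends to drive the conversation toward scale-out designs.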


One of the more effective approaches to managing unstructured data is the growing field of clustered storage. While there are about as many definitions of clustered storage as there are solutions on the market, the term generally refers to multiple networked storage elements connected via high-bandwidth pathways optimized for large, sequential read/write operations. Most solutions allow you to scale not only capacity, but throughput and other functions as well.


The newest solutions on the market are pushing the scalability factor light years beyond what can be found on traditional SAN/NAS systems. Isilon Systems recently introduced the IQ12000 and EX 12000 systems that can jump to 1.6 PB on a single volume. Using an ultra-dense configuration of 1 TB Hitachi drives, the system packs 12 TB onto a 2U form factor, allowing a 250 TB cluster in a single rack.
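The rack-density figure can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below assumes a standard 42U rack filled entirely with 2U storage nodes and treats 1.6 PB as 1,600 TB; neither assumption comes from Isilon, and a real rack would give up some space to switches and power.

```python
import math

TB_PER_NODE = 12       # from the article: 12 TB per 2U node
NODE_HEIGHT_U = 2      # from the article: 2U form factor
RACK_HEIGHT_U = 42     # assumption: standard full-height rack, all slots used
MAX_VOLUME_TB = 1600   # 1.6 PB single volume, assuming 1 PB = 1,000 TB

# How many nodes fit in one rack, and how much raw capacity that gives.
nodes_per_rack = RACK_HEIGHT_U // NODE_HEIGHT_U
tb_per_rack = nodes_per_rack * TB_PER_NODE

# How many 12 TB nodes it would take to reach the maximum volume size.
nodes_for_max_volume = math.ceil(MAX_VOLUME_TB / TB_PER_NODE)

print(f"{nodes_per_rack} nodes per rack, {tb_per_rack} TB raw per rack")
print(f"{nodes_for_max_volume} nodes needed for a {MAX_VOLUME_TB} TB volume")
```

Under these assumptions a full rack holds 21 nodes and 252 TB of raw capacity, in line with the roughly 250 TB quoted, and a 1.6 PB volume would span several racks.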


Management of clustered systems is also getting easier. Sanbolic's new Melio 2008 and LaScala 2008 software stacks for Windows Server 2008 offer centralized admin for both clustered storage and virtual storage pools by enabling concurrent shared read/write access from multiple servers. Scalability and availability of the cluster are enhanced through unlimited LUN sizes and support for Basic and GPT disk structures.


Clustered solutions also look to be part of EMC's future, with CEO Joe Tucci recently telling a group of analysts and press that the company will soon come out with a line of scalable clustered NAS devices designed for use with clustering platforms from HP, Isilon, Exanet and BlueArc. The hardware side of the project is codenamed Hulk, while the clustered file system goes by the name of Maui.


It wasn't too long ago that clustered storage was seen as a luxury that only the largest enterprises could even consider. Now that the value of unstructured data has been recognized, in both the competitive and regulatory sense, it shouldn't be long before market forces push the technology down market.
