Adding scale to enterprise infrastructure presents more than its share of challenges beyond mere upgrades to capacity, processing power and throughput. Power and data management, availability, systems interoperability and a host of other factors become more complex and intractable as resources bulk up.
But since the enterprise simply won’t survive for much longer without the ability to scale, solutions to these problems must move in tandem with the expanding infrastructure. This is particularly crucial in the storage farm, which has seen tremendous gains in speed and capacity in recent years but still trails servers and networking on key points of management and functionality.
A key problem is that most scaled solutions are designed for the rapid, high-volume environments of ecommerce and collaborative communications, not the single, large workloads of Big Data. When crunching large data sets, storage needs to be pooled into massive, unified resource sets, what experts are calling “data lakes.” In fact, EMC has built the new Isilon OneFS platform around the need to store and manage huge amounts of unstructured data, leveraging the Hadoop Distributed File System (HDFS) to provide single-file, single-volume support for analytics at the petabyte scale.
But where there is opportunity, there are people looking to score, and the scale-up/out storage market is no exception. According to ZDNet, a company called Infinidat is taking direct aim at EMC where it hurts—in the large storage array. The InfiniBox is a 2PB full-rack solution that is said to provide seven 9s reliability for block, file and object storage at a price point as low as $200,000. The system features native access to OpenStack and VMware environments and offers upwards of 48TB of Flash-based cache per controller. As well, it has RAID 6 protection and can deliver a full disk rebuild in about 15 minutes.
But since scale is about more than just size, some are wondering if an entirely new, or perhaps an old, approach to data management in large volume settings is warranted. According to Enterprise Storage Forum’s Jeff Layton, the recent International Conference on Massive Storage Systems and Technology (MSST) in Santa Clara, Calif., was abuzz with talk about key-value storage. The idea is to assign a unique, non-repeatable key to each piece of data so that it can be located and retrieved by key alone, without having to scan the rest of the data set. This approach has turned up on object-based archival systems over the past few years, but word at the conference is that it can apply to large, file-based environments as well, and is in fact already showing up in devices like Seagate’s new Kinetic drive.
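The lookup model described above can be sketched in a few lines of Python. This is a toy in-memory illustration of the concept, not the interface of Kinetic or any shipping product; the class and method names are invented for the example.

```python
# Toy sketch of key-value storage: every object is written under a
# unique, non-repeatable key, and a read is a direct lookup by that
# key rather than a scan of the whole data set or a walk of a
# file-system hierarchy.
import uuid


class KeyValueStore:
    """Illustrative in-memory key-value store (not a real product API)."""

    def __init__(self):
        self._data = {}

    def put(self, value):
        # Generate a unique key; the caller keeps it for all later access.
        key = uuid.uuid4().hex
        self._data[key] = value
        return key

    def get(self, key):
        # Direct lookup: no directory traversal, no full scan.
        return self._data[key]


store = KeyValueStore()
key = store.put(b"petabyte-scale object payload")
print(store.get(key))
```

The point of the design is that retrieval cost depends only on the key lookup, not on how large the overall data set grows, which is why the approach appeals for both object archives and large file-based environments.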
Whether you are scaling for high-volume or Big Data, however, all Web-scale architectures need to meet a few key criteria, says Coho Data’s Scott Lowe. First, make sure the architecture offers horizontal scaling (scale out) that provides both capacity and performance on demand. As well, look for multi-tier capability and automated management to reduce costs and management overhead. And the more you can deploy commodity hardware, the lower the TCO.
It would be nice if scaling up storage meant simply adding capacity to existing infrastructure, but that is not going to cut it at most enterprises. Emerging data requirements are forcing wholesale changes to the way data is managed, processed and stored, and that means we must fundamentally rethink the storage farm and the value it brings to the business process.
This is not an impossible task, but it needs to start with the recognition that what is best for the future is not just more of what has worked in the past.