I’ve mentioned in previous posts that Big Data is more than just big. In order to realize its true value, it must be fast as well.
That means analysis has to approach real-time levels in order to ensure that the final product is relevant to the rapidly changing business environments in which most enterprises find themselves. And therein lies the problem, because while Big Data analytics platforms can be deployed on existing data center infrastructure, producing a real-time architecture will take a bit of work.
Hitachi Data Systems recently completed a study of UK organizations that have implemented Big Data strategies and found that more than half were still relying on outdated or inaccurate information because their legacy infrastructure could not meet the demands of real-time analytics. A key problem remains the stubborn presence of data silos within existing infrastructure, which prevent analytics engines from gaining a true picture of both structured and unstructured data sets. Not to mention, critical data is often kept hidden from decision makers because it can’t be made available on an organization-wide basis.
Lately, however, new hardware systems designed to facilitate the kind of real-time performance that Big Data requires have hit the channel. Scale-out, solid-state storage arrays, for instance, aim to fulfill both the capacity and throughput demands of real-time analytics: SolidFire recently introduced the SF9101 platform, which can be configured with up to 100 nodes of high-speed, high-density storage at a price point under $3 per GB and $1 per I/O.
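Those two price points pull in different directions, since an array has to be sized for whichever requirement is the more expensive to meet. A minimal back-of-the-envelope sketch, using the cited $3/GB and $1/I/O figures (the workload numbers below are illustrative assumptions, not vendor specs):

```python
def array_cost(capacity_gb, iops, price_per_gb=3.0, price_per_io=1.0):
    """Estimate array cost as the larger of the capacity-driven and
    IOPS-driven prices, since the array must satisfy both requirements."""
    return max(capacity_gb * price_per_gb, iops * price_per_io)

# Hypothetical workload: a 50 TB working set needing 200,000 IOPS.
# Here throughput, not capacity, drives the price.
cost = array_cost(capacity_gb=50_000, iops=200_000)
print(f"${cost:,.0f}")  # prints "$200,000"
```

The point of the exercise: for real-time analytics workloads, throughput can easily dominate the bill even when raw capacity looks cheap.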
As for processing, new high-end models are taking up the Big Data cause after years of ceding ground to low-end, commodity systems. IBM recently added a new server, the 7R4, to its PowerLinux line. It is geared toward supporting EnterpriseDB workloads as well as large Java-based and WebSphere applications. The device holds four sockets and 32 cores and runs the same PowerVM virtualization as the rest of the Power family, allowing users to run Linux, AIX or IBM i across multiple partitions. And through its support for open source projects like OpenStack, KVM and Apache, the company says it is well positioned to support advanced cloud architectures.
That could become a crucial element as enterprises look to shore up their infrastructure for Big Data. Given the choice between retrofitting existing plants and leasing capacity on the cloud, many organizations are likely to choose the latter. As Red Hat’s Gerald Sternagl points out, hybrid infrastructure provides the scalability and performance characteristics that support Big Data analytics while still preserving critical data behind the company firewall. And through open source APIs (this is Red Hat, remember), organizations can avoid the kind of vendor lock-in that hampers many cloud-based data infrastructure solutions.
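The hybrid model Sternagl describes boils down to a placement policy: burst analytics capacity to the cloud while critical data stays behind the firewall. A minimal sketch of that idea, where the dataset names and the sensitivity flag are purely illustrative assumptions (this is not a Red Hat API):

```python
def placement(dataset):
    """Decide where a dataset lives in a hybrid architecture:
    sensitive data stays on-premises; the rest can scale out to the cloud."""
    if dataset.get("sensitive"):
        return "on-premises"
    return "cloud"

# Hypothetical inventory of enterprise data sets.
datasets = [
    {"name": "customer_pii", "sensitive": True},
    {"name": "clickstream_logs", "sensitive": False},
]
for d in datasets:
    print(d["name"], "->", placement(d))
```

In practice the classification rule would encode regulatory and contractual constraints rather than a single flag, but the shape of the decision is the same.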
Big Data can best be viewed as one of those challenge/opportunity examples. On the challenge side, infrastructure upgrades are certainly expensive and are difficult to coordinate over the long term, given the pace of change that is affecting data environments. But with virtual, cloud and software-defined technologies hitting the channel, reconfiguring and repurposing underlying hardware will become easier and less expensive, which will allow the enterprise to become more adaptable as data requirements, big or otherwise, shift over time.