It seems almost a contradiction in terms. How could something as complex as data warehousing lend itself to the plug-and-play architecture of an appliance?
As I and many others have commented before, warehousing is more than just simple data storage. Heck, we have simple storage for that. Warehousing, on the other hand, incorporates a wide range of archiving, analysis, search and other functions to ensure that your data doesn't just rot away on some disk drive somewhere. Instead, warehousing enables the conversion of that data into usable intelligence.
It's this institutional knowledge that has made warehousing such a hot commodity of late. It allows you to dig through your past to identify and act on things like changing market and sales conditions, customer/partner relationships, and a raft of other influences.
Clearly, then, that kind of functionality can't be as simple as hooking up an appliance and flicking a switch, can it?
Well, perhaps, but it will depend largely on how well organized you are to begin with, according to those with the most experience with warehouse appliances. Warehousing specialist Timothy Leonard, for one, says it is certainly possible, particularly amid heavy volumes, where appliances have already proven to be cheaper and more flexible than traditional architectures. Of course, this does not mean data integration and extraction will be any easier, as those tasks depend more on policy and process than on raw hardware.
It would be a mistake to think of appliances as just dumb boxes performing simple functions, according to Intelligent Enterprise's Dave Stoddard. As a quick perusal of new systems at the recent TDWI conference in San Diego revealed, the newest designs from IBM, Netezza, Oracle and others use the latest blade servers loaded with collaborative technology, parallel processing and advanced networking protocols like InfiniBand. But that also means these are not just "set-it-and-forget-it" devices. They will need a fair amount of TLC to keep them running smoothly.
Of all the newest developments, Netezza's TwinFin system is drawing the most buzz. Most of the press is centered on whether the decision to use Intel-based boards rather than PowerPC chips constitutes a major change to the company's hardware architecture, although that will probably be irrelevant to users eyeing multiple petabytes of storage at less than $20,000 per terabyte.
There's also been a little push-and-shove surrounding Microsoft's new DATAllegro platform. It seems that Ingres had been working on adding a high-performance storage engine to the DATAllegro platform just before DATAllegro was acquired by Microsoft last year. Now, Ingres has launched the VectorWise Project to deliver such an engine to the open source community. The technology comes out of a research center known as Centrum Wiskunde & Informatica, some of whose researchers had a hand in developing the column-based MonetDB system a number of years ago.
Warehouse appliances are still a relatively new phenomenon, so it's likely that any faults, both major and minor, won't be apparent until deployments hit a critical mass. In the meantime, all we have to go on are the reports of early adopters and the fact that all the major vendors are convinced that the appliance model provides both superior service and lower costs than any other approach.
But it's also a good bet that before too long, warehousing will be just another function to be outsourced to the cloud, and the determination as to which technology reigns supreme will become someone else's problem.