Ever since the advent of the Watson supercomputer running on IBM Power processors, IBM has been on a campaign to increase the appeal of its Power processors.
While IBM has made a lot of progress in that regard, one of the less appreciated nuances of the IBM Power8 strategy is the inherent flexibility of the computing platform. Buried inside the latest generation of the IBM Power8 processor is a new Coherent Accelerator Processor Interface (CAPI) that makes it possible to develop different classes of computer systems that share the same memory on top of a common “hollow core.” In effect, CAPI significantly improves performance by eliminating much of the overhead associated with operating systems and device drivers.
Fadi Gebara, senior manager and master inventor at IBM Research, says one of the primary benefits of a hollow-core architecture is that it makes it possible to deploy different classes of processor cores on a common Power8 platform. CAPI, for example, can treat NoSQL data residing on Flash memory like traditional memory. Combined with the ability to support 192 threads on a Power8 processor, that means a NoSQL application could enjoy a 24-to-1 memory density advantage over x86 processors, says Gebara.
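To make the "flash as memory" idea concrete, here is a minimal sketch in Python. This is not IBM's CAPI interface; it uses ordinary `mmap` to show the general principle: once a storage device (here, a plain file standing in for flash) is mapped into the address space, the application reads and writes it with plain loads and stores instead of going through per-access read/write system calls and a driver stack.

```python
import mmap
import os
import tempfile

def mmap_store_demo():
    """Conceptual sketch only (not the CAPI API): map a file --
    a stand-in for a flash device -- into memory, so access to it
    looks like ordinary memory access rather than explicit I/O calls."""
    path = tempfile.mkstemp()[1]
    try:
        # Size the backing "device" to one 4 KB page.
        with open(path, "wb") as f:
            f.write(b"\x00" * 4096)
        # Map it into the process address space.
        with open(path, "r+b") as f:
            mem = mmap.mmap(f.fileno(), 4096)
            mem[0:5] = b"hello"      # a store: looks like a memory write
            value = bytes(mem[0:5])  # a load: looks like a memory read
            mem.close()
        return value
    finally:
        os.remove(path)

print(mmap_store_demo())  # b'hello'
```

The point of the sketch is the programming model: the application never issues an explicit I/O request for each access, which is the class of overhead CAPI's coherent attachment is designed to remove.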
Thus far, IBM hasn’t done much more with CAPI than use it inside its own server lineup to increase performance. But Gebara says that IT organizations should watch this space. Members of the OpenPOWER Consortium will leverage CAPI to develop all kinds of extensions to the core Power8 platform, which will enable them to customize computing in ways that go far beyond what can be accomplished using general-purpose processors that are optimized primarily for desktop applications. In fact, Gebara says that in the future, Power8 processors will develop “situational awareness” that enables them to run as servers or clients as the nature of the application workloads running on them dynamically changes.
All of this becomes possible because, by using CAPI, the Power8 processor takes a fundamentally simpler approach to memory access that both reduces latency and removes complexity. The end result is faster applications, but perhaps more importantly, applications that can more easily share data with one another because they share access to the same common pool of memory.