Computing technology is seemingly caught between a rock and a hard place these days. The rock is physics, which is preventing CPUs from delivering the steady performance gains that Moore's Law once promised. The hard place is the fact that machine learning and other forms of artificial intelligence need far greater performance than CPUs can provide.
As we’ve seen many times in the past, however, these types of conundrums are illusory. From Edison to Jobs to Musk, countless innovators have shown that if you can’t work through a problem, work around it.
In this case, the workaround is accelerated computing. In a recent report, IDC defined accelerated computing as the practice of offloading key workloads to silicon subsystems like high-speed GPUs and flexible FPGAs. These multi-chip configurations are increasingly targeting the unstructured data workloads leveraged by artificial intelligence, advanced data analytics, cloud computing and scientific research. IDC has also released a Worldwide Accelerated Compute Taxonomy, a set of enterprise-focused hardware definitions that the company says are crucial to understanding evolving “3rd Platform” architectures.
Two key developments in GPU offloading are Nvidia’s new Volta chip, which is rated at 120 teraflops and powers the company’s new DGX-1 supercomputer, and Google’s TPU2 device, which provides about 45 teraflops. Both chips are accelerating AI frameworks like TensorFlow, with Google even tapping Nvidia’s CUDA parallel processing architecture for key workloads on the Google Cloud. For its part, Nvidia is looking to accelerate every aspect of the AI process, offering optimized frameworks for key development ecosystems like Caffe, Chainer and Microsoft Cognitive Toolkit.
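To put those teraflops figures in perspective, a back-of-the-envelope calculation shows the ideal lower bound on how long one pass of a model would take at each chip's peak rate. The model size here is a hypothetical round number, not from the article, and real workloads achieve only a fraction of peak throughput:

```python
# Idealized time per pass at peak throughput. Real workloads hit only
# a fraction of these rates, so treat the results as lower bounds.
FLOPS_VOLTA = 120e12   # Nvidia Volta: 120 teraflops (figure from the article)
FLOPS_TPU2 = 45e12     # Google TPU2: ~45 teraflops (figure from the article)

MODEL_FLOPS = 1e10     # hypothetical model needing 10 billion operations/pass

def seconds_per_pass(ops, peak_flops):
    """Ideal lower bound on wall-clock time for `ops` operations."""
    return ops / peak_flops

volta_s = seconds_per_pass(MODEL_FLOPS, FLOPS_VOLTA)
tpu2_s = seconds_per_pass(MODEL_FLOPS, FLOPS_TPU2)
print(f"Volta: {volta_s * 1e6:.1f} microseconds per pass")
print(f"TPU2:  {tpu2_s * 1e6:.1f} microseconds per pass")
```

Even at a conservative fraction of peak, that is thousands of passes per second on a single accelerator, which is why offloading pays for itself so quickly in AI workloads.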
The advantages that GPU-accelerated databases bring to cutting-edge applications like cognitive computing are impressive, says eWeek’s Chris Preimesberger. For one thing, organizations can converge artificial intelligence and business intelligence workloads, giving traditional platforms a 10-to-100-fold performance boost that delivers real-time or near-real-time results. GPUs also excel at the vector and matrix operations that underpin AI models built with leading libraries like BIDMach and TensorFlow, and they can simplify the use of advanced algorithms for business users by exposing them through standard REST APIs and point-and-click interfaces.
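The vector and matrix operations in question are the batched linear algebra at the heart of neural networks. A minimal sketch: a dense layer is one matrix multiply plus a bias add. This runs on the CPU with NumPy, but GPU array libraries such as CuPy expose a NumPy-compatible interface, so essentially the same call gets offloaded to the accelerator:

```python
import numpy as np

# A dense neural-network layer is one matrix multiply plus a bias add --
# exactly the batched linear algebra GPUs excel at. NumPy runs this on
# the CPU; a GPU array library like CuPy offers the same interface.
rng = np.random.default_rng(0)

batch = rng.standard_normal((64, 512))     # 64 input rows, 512 features each
weights = rng.standard_normal((512, 256))  # layer weights
bias = np.zeros(256)

activations = np.maximum(batch @ weights + bias, 0)  # ReLU(xW + b)
print(activations.shape)  # (64, 256)
```

Because the whole batch is processed in a single matrix operation, the work parallelizes naturally across a GPU's thousands of cores, which is where the 10-to-100-fold speedups come from.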
No matter how advanced the processing unit is, however, performance in multi-chip configurations is only as good as the interconnect. This is why Xilinx and IBM collaborated on a new PCIe Gen4 solution that doubles throughput over Gen3. The architecture provides 16 Gbps per lane between the Xilinx UltraScale FPGA and the IBM Power9. The design is targeted at AI and analytics applications in the data center, as well as accelerated cloud computing. (Disclosure: I provide content services to IBM.)
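The per-lane figure translates into substantial aggregate bandwidth. A rough calculation, assuming the PCIe spec's 16 GT/s Gen4 rate, 128b/130b line encoding and a full x16 link (the encoding and lane-width figures come from the PCIe specification, not the article):

```python
# Rough per-direction bandwidth of a PCIe Gen4 link. These are spec-level
# figures; protocol overhead reduces what applications actually see.
GT_PER_S = 16           # PCIe Gen4 raw rate per lane, gigatransfers/second
ENCODING = 128 / 130    # 128b/130b line encoding (used by Gen3 and Gen4)
LANES = 16              # a x16 link

def link_gbytes_per_s(gt_per_s, lanes, encoding):
    """Payload bandwidth in GB/s, one direction, before protocol overhead."""
    return gt_per_s * lanes * encoding / 8  # 8 bits per byte

print(f"{link_gbytes_per_s(GT_PER_S, LANES, ENCODING):.1f} GB/s per direction")
```

That works out to roughly 31.5 GB/s in each direction, double the Gen3 figure, which matters when an FPGA and a CPU are constantly exchanging intermediate results.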
While innovation has reset the bar for what’s possible throughout history, it has also had the unintended consequence of prompting mankind to immediately set a new bar. Think of the steam engine, the telephone and the PC. Today’s data acceleration technologies may augur a new era of seemingly unlimited computing, but it won’t be long before the world starts to chafe under its limitations.
But this is as it should be. If mankind ever reaches a point at which there is nothing left to strive for, the very essence of being human will have lost something that is truly precious.
Arthur Cole writes about infrastructure for IT Business Edge. Cole has been covering the high-tech media and computing industries for more than 20 years, having served as editor of TV Technology, Video Technology News, Internet News and Multimedia Weekly. His contributions have appeared in Communications Today and Enterprise Networking Planet and as web content for numerous high-tech clients like TwinStrata and Carpathia. Follow Art on Twitter @acole602.