Skytap has become the latest provider of cloud computing services to make instances of an Apache Hadoop distribution available in the cloud. The decision that IT organizations will ultimately have to make is where to run those Big Data applications once they make their way to a production environment.
According to Brett Goodwin, vice president of marketing and business development for Skytap, the cloud service provider is partnering with Cloudera to give IT organizations a cloud service that makes it easy to build and test Hadoop applications. What’s not as clear, says Goodwin, is where those applications will actually run once they reach a production environment. Many of those Big Data applications may not be practical to run in the cloud because of the I/O performance issues associated with moving large volumes of data across a wide area network.
Goodwin says it’s pretty clear that IT organizations need a way to get up to speed on Hadoop that doesn’t require them to invest in a lot of IT infrastructure. For that reason, Skytap has partnered with Cloudera to create instances of Hadoop running on a 10-node cluster in the cloud that Goodwin says can be spun up in as little as 10 minutes. As part of an effort to help drive development of those applications, the service itself is free for up to 50 nodes, adds Goodwin.
But once an IT organization decides it has an application that will add value to the business, the decision on where to deploy that application will vary based on performance requirements. Some IT organizations may opt to deploy that Big Data application in the cloud because performance is not a major consideration for them. But most IT organizations are likely to consider Big Data application performance to be a significant issue. Goodwin says that is one of the reasons Skytap developed support for a hybrid cloud architecture, running on top of VMware virtual machines, that spans Skytap servers in the cloud and on-premises systems in the enterprise.
Ultimately, one of the most critical decisions any IT organization can make about a Big Data application is where to host it. Once a deployed application and its data grow to any significant size, moving it will prove to be impractical. Worse yet, the cost of transferring data across wide area networks could wind up making the application overly expensive to deploy and manage. In other words, depending on the application, cloud computing is not always as inexpensive as it might initially seem.
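To make the wide area network concern concrete, here is a rough back-of-the-envelope sketch in Python. It is not anything Skytap or Cloudera provides; the function names, the efficiency factor, and the per-gigabyte price parameter are all hypothetical inputs that an IT organization would replace with its own measurements and its provider's actual rates.

```python
def transfer_hours(dataset_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move a dataset across a WAN link.

    dataset_tb  -- dataset size in terabytes (decimal, 1 TB = 10**12 bytes)
    link_mbps   -- nominal link speed in megabits per second
    efficiency  -- assumed fraction of the nominal speed actually achieved
                   (hypothetical default; measure your own link)
    """
    total_bits = dataset_tb * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / effective_bits_per_second / 3600


def transfer_cost(dataset_tb: float, price_per_gb: float) -> float:
    """Estimate the transfer charge for a dataset.

    price_per_gb is a placeholder; use the rate your provider really charges.
    """
    return dataset_tb * 1000 * price_per_gb


if __name__ == "__main__":
    # Illustrative inputs only, not figures from Skytap or any other provider.
    hours = transfer_hours(dataset_tb=1, link_mbps=100, efficiency=1.0)
    print(f"Estimated transfer time: {hours:.1f} hours")
```

Under these assumptions, a single terabyte over a fully utilized 100 Mbps link already takes about 22 hours to move, which is why the hosting decision gets harder to reverse as the data set grows.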