Before cloud, provisioning was often a lengthy and bureaucratic process: obtaining new or additional capacity could take days, if not weeks. The cloud deployment model and its supporting virtualization technology allow consumers, often through Web portals, to trigger their own provisioning. This automation eliminates the need for front-end IT involvement in provisioning. However, the ability to support end-user-triggered, fully automated provisioning can create a flood of new (and often unexpected) virtualized compute platforms and associated storage in hours rather than days or weeks.
While the new cloud deployment strategy may have removed IT from the front end of the process, it is critical that IT develops robust backend capabilities, with the tools and techniques to understand exactly what is happening in the cloud. In an undisciplined environment, the speed with which a virtual server can be deployed, and the low cost of its use, can result in a growing number of abandoned or orphaned virtual machines. This is typically seen in rapid development and test environments and in the rapid deployment of market-facing programs. Because cloud deployment optimizes time-to-market, we can reasonably expect this situation to grow. Let's look at some of the basic backend capabilities IT must develop and deploy if it is to retain the ability to cost-effectively administer this new and dynamic infrastructure.
Defining Cloud Services and Outlining Costs
Cloud is often referred to as a 'utility' service. In other words, it is always available, like water or electricity, and you simply pay for what you use. Some utilities, like telephone and cable TV, have a connection charge and a monthly fee that you pay whether you use the service or not: even if you never watch TV, you still pay your monthly fee. With metered utilities, if you leave your lights on or the water running, you will be charged for that use.
What does this mean for IT? To manage cloud services, cloud connection and cloud usage, IT must have a clear definition of the services it provides and a billing schedule covering those services. In short, IT absolutely must have governance policies for the provision of cloud-based services. IT needs to ensure that these policies are rational, well-defined, consumer-socialized and executive-approved. An organization providing free services can exist only as a charity with generous donations. A commercially viable operation must charge appropriate rates for industry-competitive services while striving for quality differentiation.
Once governance policies are established, IT must determine, from the consumer community and from industry research, the type and nature of the desired cloud services. Once defined, these services can be mapped to the most cost-effective supporting infrastructure that meets the service needs. The services and their attributes should be published in a catalog, with each service carrying a price per unit. The prudent IT team will also provide a service-level agreement as part of its contract with the consumer, spelling out the guarantees and details of the services provided and the mutual responsibilities.
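To make the idea of a published catalog concrete, the following is a minimal sketch of how catalog entries might be represented. It is not a prescribed design: the service names, units, prices and availability figures are all invented for illustration, as is the helper `cheapest_meeting`, which picks the lowest-priced service that satisfies a required availability.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CatalogEntry:
    """One published service. All fields are illustrative assumptions."""
    name: str
    unit: str                 # unit of resource being priced, e.g. "GB-month"
    price_per_unit: Decimal   # price charged per unit
    availability_sla: float   # availability guaranteed in the SLA, in percent


# Hypothetical catalog: tiers, prices and guarantees are made up.
CATALOG = [
    CatalogEntry("storage-standard", "GB-month", Decimal("0.10"), 99.0),
    CatalogEntry("storage-premium", "GB-month", Decimal("0.25"), 99.9),
    CatalogEntry("compute-standard", "GHz-hour", Decimal("0.04"), 99.5),
]


def cheapest_meeting(unit: str, min_availability: float) -> CatalogEntry:
    """Return the lowest-priced entry for `unit` that meets the availability need."""
    candidates = [
        entry for entry in CATALOG
        if entry.unit == unit and entry.availability_sla >= min_availability
    ]
    if not candidates:
        raise LookupError(f"no {unit} service guarantees {min_availability}%")
    return min(candidates, key=lambda entry: entry.price_per_unit)
```

A consumer who needs storage with at least 99.5 percent availability would be steered to the premium tier, while one with a looser requirement would get the standard tier at the lower price.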
The Provisioning Process
So now we know the services IT will make available, their cost, how consumers will choose those services and what is guaranteed. The next step is provisioning, which is typically consumer-driven through Web-based automation. Provisioning (and, more importantly, de-provisioning) is usually fully automated. What IT needs is a highly focused set of key performance indicators to understand in real time what is happening, while also ensuring its ability to support the service demand as it fluctuates. There are three major drivers for these metrics. The first is the ability to identify resource usage to support billing. The second is the ability to report on service-level achievement to support quality of service. The third is the ability to track usage and incorporate it into an effective supply-and-demand model, so that IT can meet reasonably projected demand.
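The first driver, usage-based billing, can be sketched as a simple aggregation of metered usage records. This is only an illustration under stated assumptions: the price list, the service identifiers and the shape of a usage record (consumer, service, quantity) are hypothetical, and a real implementation would take its prices from the service catalog.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical price per unit; in practice this would come from the catalog.
PRICES = {
    "storage-gb-month": Decimal("0.10"),
    "compute-ghz-hour": Decimal("0.04"),
}


def bill(usage_records):
    """Sum metered usage into one charge per consumer.

    usage_records: iterable of (consumer, service, quantity) tuples,
    an assumed record shape for this sketch.
    """
    totals = defaultdict(Decimal)  # a missing consumer starts at Decimal("0")
    for consumer, service, quantity in usage_records:
        totals[consumer] += PRICES[service] * Decimal(str(quantity))
    # Round each total to whole cents for invoicing.
    return {c: amount.quantize(Decimal("0.01")) for c, amount in totals.items()}


records = [
    ("marketing", "storage-gb-month", 500),
    ("marketing", "compute-ghz-hour", 100),
    ("dev", "storage-gb-month", 20),
]
invoice = bill(records)  # marketing owes 54.00, dev owes 2.00
```

Using `Decimal` rather than floating point keeps the monetary arithmetic exact, which matters once charges are summed across many small usage records.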
Transitioning to an Internal Cloud Capability
One of the key challenges for many IT organizations transitioning to an internal cloud capability is the move from a somewhat laissez-faire approach to resource management (based on organizational knowledge and expertise) to the discipline and detailed understanding of the infrastructure that are needed when services and resources are billed out, guaranteed and often deployed without IT involvement. At a minimum, an IT team contemplating internal cloud adoption should first ensure it has fully identified and inventoried its technology infrastructure.
The recommended approach here is to use an ITIL artifact called the configuration management database, or CMDB. The CMDB holds all the entities in the IT infrastructure, including every server (physical or virtual), every storage tier or storage technology combination and every application. It records these entities in a comprehensive and correlated database that supports three critical functions. First, it is essential to any form of operational or disaster planning. Second, it provides a disciplined foundation for evaluating and executing change management and release management. Third, it provides a baseline for supply-and-demand utilization and projections at the unit-of-resource level, for example a GB of storage or a GHz of compute power. The CMDB must also be able to capture, from the provisioning process, the virtual and/or physical resources that have been assigned to each consumer. This is needed for quality control, billing, and supply-and-demand management.
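As a rough illustration of the record-keeping described above, the sketch below models a toy CMDB in memory. It is far simpler than any real CMDB product: the class and method names (`ConfigurationItem`, `CMDB`, `register`, `assign`, `free`, `allocated_to`) are hypothetical, and it tracks only capacity and per-consumer assignments at the unit-of-resource level.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ConfigurationItem:
    """An infrastructure entity (assumed minimal set of attributes)."""
    ci_id: str
    kind: str        # e.g. "server", "storage", "application"
    capacity: float  # total capacity, in `unit`
    unit: str        # unit of resource, e.g. "GB" or "GHz"


class CMDB:
    """Toy in-memory inventory with per-consumer resource assignments."""

    def __init__(self):
        self.items = {}        # ci_id -> ConfigurationItem
        self.assignments = []  # (consumer, ci_id, amount) from provisioning

    def register(self, item):
        self.items[item.ci_id] = item

    def free(self, ci_id):
        """Capacity of an item not yet assigned to any consumer."""
        used = sum(amt for _, cid, amt in self.assignments if cid == ci_id)
        return self.items[ci_id].capacity - used

    def assign(self, consumer, ci_id, amount):
        """Record a provisioning event; refuse to over-allocate."""
        if amount > self.free(ci_id):
            raise ValueError(f"{ci_id} has only {self.free(ci_id)} free")
        self.assignments.append((consumer, ci_id, amount))

    def allocated_to(self, consumer):
        """Total resources held by a consumer, keyed by unit of resource."""
        totals = defaultdict(float)
        for who, cid, amt in self.assignments:
            if who == consumer:
                totals[self.items[cid].unit] += amt
        return dict(totals)


cmdb = CMDB()
cmdb.register(ConfigurationItem("array-01", "storage", 1000, "GB"))
cmdb.register(ConfigurationItem("host-01", "server", 24, "GHz"))
cmdb.assign("dev", "array-01", 200)
cmdb.assign("dev", "host-01", 4)
```

The same assignment records serve all three purposes named in the text: `allocated_to` feeds billing, `free` feeds supply-and-demand projections, and the inventory itself is the baseline for change and release management.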
The second key focus for metrics is the ability to understand and report on quality of service (QoS). Specifically, we need to prove that we have met the service levels offered in the service catalog and guaranteed in the service-level agreement. This means we need a set of metrics that allows us to make this determination both at the service-tier level (as chosen by the consumer from the service catalog) and at the level of the individual bill-paying consumer. Such metrics typically include availability, performance and backup completions, and sometimes extend to recovery point objectives (RPO) and recovery time objectives (RTO).
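Availability is the easiest of these metrics to illustrate. The sketch below computes achieved availability from recorded outage minutes and compares it with the target for the consumer's chosen tier. The tier names, targets and the shape of the outage records are assumptions made for the example, not values taken from any particular service-level agreement.

```python
# Hypothetical availability targets per service tier, in percent.
TARGETS = {"standard": 99.0, "premium": 99.9}


def availability_percent(period_minutes, outage_minutes):
    """Share of the reporting period during which the service was up."""
    return 100.0 * (period_minutes - outage_minutes) / period_minutes


def sla_report(period_minutes, outages):
    """Build a per-consumer report of achieved availability.

    outages: iterable of (consumer, tier, outage_minutes) tuples,
    an assumed record shape for this sketch.
    Returns {consumer: (tier, achieved_percent, target_met)}.
    """
    report = {}
    for consumer, tier, minutes in outages:
        achieved = availability_percent(period_minutes, minutes)
        report[consumer] = (tier, round(achieved, 3), achieved >= TARGETS[tier])
    return report


MINUTES_IN_30_DAYS = 30 * 24 * 60  # 43,200
report = sla_report(
    MINUTES_IN_30_DAYS,
    [("dev", "standard", 120), ("marketing", "premium", 60)],
)
```

In this example two hours of downtime still satisfies the standard tier, while a single hour of downtime breaches the premium tier, which shows why the determination has to be made per tier as well as per consumer.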
The third focus for metrics is the need to understand what resources are available and what has been used. The time interval selected for these metrics will vary with the level of activity. In a high-activity environment, hourly or even real-time monitoring (with threshold alarms) may be appropriate. In a less active environment, daily or even weekly monitoring may suffice. In any event, it is prudent to set upper and lower thresholds on resource usage to trigger escalation alarms. In the longer term, these numbers and trends can be used to predict usage and ensure that purchasing cycles can be initiated in time to meet projected needs. The smaller sample set of an internal private cloud provider may militate against an effective supply projection capability, compared with public cloud providers that have a huge consumer base.
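Both ideas, threshold alarms and trend-based projection, can be sketched in a few lines. The example below is a simplification under stated assumptions: the 20 and 80 percent thresholds are arbitrary defaults, the usage samples are assumed to be equally spaced in time, and the projection is a plain least-squares straight line, which will be unreliable with the small or noisy sample sets mentioned above.

```python
def check_thresholds(used, capacity, low=0.20, high=0.80):
    """Classify utilization against lower and upper alarm thresholds."""
    utilization = used / capacity
    if utilization >= high:
        return "high"  # escalate: capacity is running short
    if utilization <= low:
        return "low"   # escalate: capacity may be over-provisioned
    return "ok"


def periods_until_full(samples, capacity):
    """Project when usage reaches capacity using a least-squares line.

    samples: usage readings at equally spaced intervals, oldest first.
    Returns the number of intervals after the last sample at which the
    fitted line reaches `capacity`, or None if usage is not growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    x_full = (capacity - intercept) / slope
    return max(0.0, x_full - (n - 1))
```

For instance, with weekly readings of 100, 200, 300 and 400 GB against a 1,000 GB pool, the fitted line grows by 100 GB per week and reaches capacity six weeks after the last reading, which is the lead time available to the purchasing cycle.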
Attention to the issues described in this paper will avoid the embarrassment of losing control over your virtual machines and accumulating a plethora of orphaned virtual machines, each potentially claiming 20 GB of storage. The ease of creating virtual machines in an organization driven by operational imperatives can result in resources being effectively taken off the table and hidden away from future reuse.
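One possible way to surface orphaned virtual machines is to flag any machine that has no recorded owner or has shown no activity for some period. The sketch below assumes such an inventory is available; the `VirtualMachine` fields, the 30-day idle window and the sample data are all hypothetical, and what counts as "activity" would depend on the monitoring tools in use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class VirtualMachine:
    """Assumed inventory record for one virtual machine."""
    name: str
    owner: Optional[str]      # None when no consumer is on record
    last_activity: datetime   # last observed use, however that is measured
    storage_gb: float


def find_orphans(vms, now, idle_days=30):
    """Return VMs with no owner or no activity within `idle_days`."""
    cutoff = now - timedelta(days=idle_days)
    return [vm for vm in vms if vm.owner is None or vm.last_activity < cutoff]


def reclaimable_storage_gb(orphans):
    """Storage that would be freed if the orphaned VMs were retired."""
    return sum(vm.storage_gb for vm in orphans)


now = datetime(2024, 1, 31)
inventory = [
    VirtualMachine("web-01", "marketing", datetime(2024, 1, 30), 20),
    VirtualMachine("test-07", None, datetime(2024, 1, 30), 20),
    VirtualMachine("build-03", "dev", datetime(2023, 11, 1), 20),
]
orphans = find_orphans(inventory, now)
```

Here two of the three machines are flagged, one because it has no owner and one because it has been idle for about three months, and together they hold 40 GB that could be returned to the pool after review.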
It is important to keep in mind that virtualized and automatically provisioned storage can have a significant downside. Unless the storage is provided under a reasonable billing model, there is little incentive for cost-conscious use of this resource. While relatively cheap, storage can consume a significant investment in both people and infrastructure. And although storage is often likened to the water utility model, the analogy breaks down: the water utility provides a product that is consumed and/or discarded, whereas data stays in the environment permanently unless something is done about it. Imagine if we had to keep in our basement every gallon of water we pulled from the faucet. Without the disciplines outlined in this paper, the use of storage on demand may drive similar absurdities.
By building robust backend capabilities well ahead of the deployment of internal cloud services, IT can cost-effectively handle the dynamic fluctuations and demands inherent in consumer-driven resource usage under the internal cloud model.