Organizations that use prepackaged ERP/CRM, custom, and third-party applications are seeing their production databases grow exponentially. At the same time, business policies and regulations require them to retain structured and unstructured data indefinitely. Storing increasing amounts of data on production systems is a recipe for poor performance no matter how much hardware is added or how much an application is tuned. Organizations need a way to manage this growth effectively.
Over the past few years, the Storage Networking Industry Association (SNIA) has promoted the concept of Information Lifecycle Management (ILM) as a means of better aligning the business value of data with the most appropriate and cost-effective IT infrastructure — from the time information is added to the database until it can be destroyed.
While the SNIA defines what an ILM system should accomplish, it does not specify any particular technology for implementing application ILM. According to Informatica, archiving is one approach that can be particularly effective — if organizations follow archiving best practices to ensure the optimal management of data during its life cycle.
This slideshow features Informatica’s best-practices approach for implementing application ILM archiving.
Click through for nine best practices for effective ILM archiving, as identified by Informatica.
As organizations grow, adjust their business strategies, or undergo mergers and acquisitions, their data volumes expand and storage requirements change. To plan their archiving strategy most effectively, organizations need visibility into the resulting data growth trends.
A best-practice archiving solution will include tools to enable the organization to evaluate where data is currently located as well as which applications and tables are responsible for the most data growth. Organizations must perform this evaluation on an ongoing basis to continually adjust their archiving strategy as necessary and maximize the ROI for these archiving efforts.
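The kind of ongoing evaluation described above can be as simple as comparing row-count snapshots over time to see which tables drive growth. The sketch below illustrates the idea; the table names and counts are invented for illustration and do not come from any particular product.

```python
# Hypothetical row-count snapshots taken at two points in time
# (table name -> row count). Values are purely illustrative.
before = {"gl_journal_lines": 1_200_000, "oe_order_lines": 800_000, "hr_employees": 5_000}
after = {"gl_journal_lines": 1_900_000, "oe_order_lines": 1_700_000, "hr_employees": 5_100}

def top_growth_tables(before, after, n=3):
    """Rank tables by absolute row-count growth between two snapshots."""
    growth = {t: after[t] - before.get(t, 0) for t in after}
    return sorted(growth, key=growth.get, reverse=True)[:n]

print(top_growth_tables(before, after, n=2))  # ['oe_order_lines', 'gl_journal_lines']
```

Run periodically, a report like this tells the organization where archiving effort will pay off most.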
To define the most appropriate archiving strategy, organizations must determine their objectives. Some organizations will emphasize performance, others space savings, and still others regulatory compliance. Examples of archiving goals include:
- Improve response time for online queries to ensure timely access to current production data.
- Shorten batch processing windows so jobs complete before the start of routine business hours.
- Reduce time required for routine database maintenance, backup, and disaster recovery processes.
- Maximize the use of current storage and processing capacity and defer the cost of hardware and storage upgrades.
- Meet regulatory requirements by purging selected data from the production environment and providing secure read-only access to it.
- Archive before upgrade to reduce the outage window required by the upgrade.
Once an organization understands its environment and success criteria, it must classify the different types of data it wishes to archive. As one example, in a general ledger module, an organization may decide to classify data as balances and journals. In an order management module, an organization may classify data into different types of orders such as consumer orders or business orders or perhaps orders by business unit.
Organizations can then create data retention policies that specify criteria for retaining and archiving each classification of data. These archiving policies must take into account data access patterns and the organization’s need to perform transactions on the data. For example, a company may choose to keep one year of industrial orders from an order management module in the production database while keeping only six months of consumer order data there. Similarly, an organization could keep nine months of data for its U.S. business unit but only three months for its U.K. operations, a difference that could be dictated by different policies for accepting returns.
Data retention policies must also maintain consistency across modules, where appropriate. For example, when archiving a payroll module, organizations will want to coordinate retention policies with those of the benefits module because the data in these two modules is likely to contain significant interdependencies. Likewise, a typical manufacturing organization needs a consistent retention policy spanning its inventory, bill of materials, and work-in-process modules.
The archiving solution an organization chooses must therefore be flexible enough to accommodate separate retention policies for different data classifications and to allow those policies to be modified as requirements change.
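One way to picture such per-classification policies is as a simple lookup from module and data classification to a retention window, consulted when deciding whether a record is old enough to archive. The sketch below is a minimal illustration; the module names, classifications, and retention values mirror the examples above but are otherwise assumptions, and the month arithmetic is approximate.

```python
from datetime import date, timedelta

# Hypothetical retention policies (months of data kept in production),
# keyed by (module, data classification). Values echo the examples in
# the text and are illustrative only.
RETENTION_MONTHS = {
    ("order_management", "industrial"): 12,
    ("order_management", "consumer"): 6,
    ("order_management", "us_business_unit"): 9,
    ("order_management", "uk_business_unit"): 3,
}

def is_archivable(module, classification, record_date, today=None):
    """Return True if a record falls outside its retention window."""
    today = today or date.today()
    months = RETENTION_MONTHS[(module, classification)]
    cutoff = today - timedelta(days=30 * months)  # rough month approximation
    return record_date < cutoff
```

Because the policies live in one table rather than being scattered through code, adjusting a retention window as requirements change is a one-line edit.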
The number one concern for organizations implementing a data growth management solution is to ensure the integrity of the business application. Thus, the process of archiving must take into account the business context of the data as well as relationships between different types of data. Data management is rendered even more complex because transactional dependencies are often defined at the application layer rather than the database layer. This means that a data growth management tool cannot simply reverse engineer the data model at the time of implementation. And any auto-discovery process is bound to be insufficient because it will miss all of the relationships embedded in the application. These rules and relationships can become quite complicated in large prepackaged products, such as Oracle E-Business Suite, PeopleSoft Enterprise, and Siebel CRM, which may have tens of thousands of database objects and a large number of integrated modules.
Successfully archiving data in these solutions requires an in-depth understanding of how the application defines a database object — that is, where the data is located and what structured and unstructured data needs to be related — and the set of rules that operate against the data. Most in-house developers have a difficult time reverse engineering the data relationships in complex applications. A best-practices archiving solution includes prepackaged business rules that incorporate an in-depth understanding of the way a particular enterprise solution stores and structures data. By choosing a solution with prepackaged rules, organizations save the time and effort of determining which tables to archive.
Since not every ERP or CRM customer runs its applications exactly the way the vendor envisions, an archiving solution must also allow organizations to modify and customize the prepackaged archiving business rules. For example, a standard business rule may prohibit archiving recurring invoices, but a custom rule could permit it once every recurring invoice in an invoice template is archivable. A best-practices solution should include a graphical developer toolkit that resembles standard database design tools and makes it easy to modify the prepackaged archiving rules.
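The recurring-invoice example can be expressed as a small predicate: the template is archivable only when every invoice it generated is individually archivable. This is a sketch under assumptions; the per-invoice criteria (closed status, fully paid) are invented for illustration, not rules from any specific ERP.

```python
def invoice_archivable(invoice):
    """Hypothetical per-invoice check: closed and fully paid."""
    return invoice["status"] == "closed" and invoice["paid"]

def template_archivable(template_invoices):
    """Custom rule: recurring invoices under an invoice template may be
    archived only when every invoice generated from the template is
    individually archivable."""
    return all(invoice_archivable(inv) for inv in template_invoices)
```

A graphical rule editor would let administrators build conditions like this without hand-coding them, but the underlying logic is the same all-or-nothing check.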
Once the organization has developed business rules, it needs to test them by simulating what will happen when data is actually archived. A best-practices solution provides simulation reporting that shows database administrators exactly how many records a given archiving policy will remove from the production system and how many will remain because the ERP classifies them as exceptions.
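At its core, such a simulation is a dry run that classifies each record as archived, retained, or held back as an exception, without moving anything. The sketch below shows the counting logic under assumed record fields (`age_days`, `open`); a real product would report per-table and per-policy breakdowns.

```python
def simulate_archive(records, is_archivable, is_exception):
    """Dry-run report: how many records a policy would archive, how many
    the application flags as exceptions, and how many stay in production."""
    counts = {"archived": 0, "exceptions": 0, "retained": 0}
    for record in records:
        if not is_archivable(record):
            counts["retained"] += 1      # still inside retention window
        elif is_exception(record):
            counts["exceptions"] += 1    # eligible but blocked by the application
        else:
            counts["archived"] += 1
    return counts
```

Reviewing these counts before running the real archive job lets DBAs catch an overly aggressive policy while it is still harmless.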
Many organizations will want to control which users access historical data in the archive and which data access method — screen, report, or query — they can use to access the data. A best-practices solution will allow organizations to configure user access policies that specify which users are authorized to access historical data and which reports they are able to use.
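Conceptually, such a policy is a mapping from user role to the set of permitted access methods for archived data. The sketch below is a minimal illustration; the role names and method names are assumptions, not any product's vocabulary.

```python
# Hypothetical mapping of roles to the access methods (screen, report,
# query) they may use against archived data.
ACCESS_POLICIES = {
    "auditor": {"report", "query"},
    "support_analyst": {"screen"},
}

def can_access(role, method):
    """Check whether a role may use a given access method on archived data."""
    return method in ACCESS_POLICIES.get(role, set())
```

Defaulting unknown roles to no access keeps the policy fail-closed, which matters most for read-only historical data retained for compliance.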
A restoration capability functions as an insurance policy in case specific transactions need to be modified after archiving. Only by having such a restoration capability can most organizations convince business users that it is safe to implement an archiving solution.
No organization wants its implementation — no matter how customized — to be on the bleeding edge of experimentation. It wants to be sure that the vendor it works with has seen and addressed the types of challenges likely to arise during an implementation. Therefore, organizations should choose a vendor that has developed an implementation methodology for complex archiving solutions that meets the outlined business objectives and has been successfully applied over a large number of implementations.