Patch Management: Building the Process

Successful patch management requires the formation of a robust process to ensure timely and accurate application and security fixes within an IT environment. Throughout 2004/05, patch-management processes will mature and merge into other key support processes and related process activities.
Written by Mark Vanston, Contributor

META Trend: Through 2006-08, IT operations organizations will begin standardizing platform-agnostic, sustaining work activities, as well as the integration points between these activities, bolstering IT organizations' ability to enhance performance across the IT delivery life cycle and reporting/improvement activities (e.g., quality, cost reduction, reporting structures). By 2006, 60% of the Global 2000 will deploy first-generation process bundles (e.g., change/production acceptance, customer advocacy, asset management) as virtual centers of excellence, formalizing/tuning these process groups as required through 2008.

In 2003, the collective number of security and software fixes released reached a never-before-seen level. During 2004, IT organizations (ITOs) will continue to be flooded with a similar - if not greater - number of patches, putting the onus on security and operations staff to implement an effective patch-management process enabled by point automation tools. Patch management may never be an inherently proactive task (i.e., vulnerabilities can be patched only after they are discovered). However, most ITOs continue to react after the inevitable attacks instead of taking action before the attack (i.e., upon notice of the initial vulnerability and publication of the remedial patch). We recommend that ITOs move away from manual postattack patch-management practices. Instead, a patch-availability-driven security and operational process should be implemented - one enabled by automation tools to quickly assess, collect, and distribute patches as they become available.

For most organizations, current patch-management processes are ad hoc and manual at best, which typically exposes the corporation to security vulnerabilities and drains limited IT resources. However, by 2005, we believe 40% of ITOs will have implemented dedicated patch-management processes and point solutions on their servers, with that number growing to 75% and including both servers and endpoints (e.g., desktops, laptops) by 2007.

Currently, patch management involves inconsistent, manual, and impromptu collection, categorization, and distribution of patches - a process that merely reacts to software vendor publishing cycles. To implement the patches, organizations often resort to labor-intensive system updating (i.e., IT staff must literally “touch each box”). This manual approach, combined with time and budgetary constraints, limits the scope of the effort: most organizations, inundated just by the security fixes released by major software vendors such as Microsoft, have the time and human resources to address only their Windows environment, leaving other server platforms and endpoints exposed. Most important of all is the lack of integration and validation testing within the process. Most organizations have no method of validating potential conflicts between patches and current software - particularly software unrelated to the patched component (e.g., patching the Windows OS may affect an Oracle database).

These limiting factors often result in a patch-management schedule that runs on a quarterly basis at best. The significance is that some of these patches may already be several months old. More important, if some of the fixes being applied are security patches, then the organization has exposed known security vulnerabilities to outside attack for as long as three months. In highly distributed corporations, individual ITOs may not even perform this ad hoc patch management on the same schedule, resulting in patch levels that are never in sync across the enterprise. This can become a huge problem, particularly in times of disaster. Organizations trying to restore systems after a failure will have a longer restore process if backup servers are not in sync with the primary production environment.

We recommend a two-pronged approach to solving this problem: 1) defining and following a clearly documented, easily repeatable, and enterprisewide patch-management process; and 2) using a dedicated patch management tool to automate the majority of the tasks contained within that process. Fundamental to the success of the process is the need to integrate both security and operational policies and define and delegate responsibilities accordingly.

Given that the majority of current patch activities are prompted by security vulnerabilities, the process must define and delineate clear responsibilities for both the security and the operations groups in IT, as well the handoffs between them. Security staff should be responsible for identifying the risks associated with each patch while providing IT operations with a prioritized means of addressing those risks. (A key point is that each and every patch cannot be applied, because this would be too costly and time-consuming.) However, our research indicates that a communication breakdown between security and operations frequently leads to delays in the distribution of critical patches. Without urgency levels identified ahead of time by the security group, important patches may not receive the emergency treatment they should once they arrive in IT operations. Therefore, it is imperative that these two IT groups negotiate mutually agreed-on service levels for each prioritization (e.g., “security patches addressing ‘severe risk’ vulnerabilities supersede all current patches and updates in the queue and must be fully deployed on all servers within five days”).
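Negotiated service levels of this kind can be captured as a simple lookup that operations can check deployments against. The sketch below is a minimal illustration; the risk labels and day counts are assumptions modeled on the “five days for severe risk” example, not prescribed values:

```python
from datetime import datetime, timedelta

# Hypothetical negotiated service levels: maximum time from patch
# publication to full deployment, keyed by security-assigned risk level.
SERVICE_LEVELS = {
    "severe": timedelta(days=5),
    "high": timedelta(days=14),
    "medium": timedelta(days=30),
    "low": timedelta(days=90),
}

def deployment_deadline(published: datetime, risk: str) -> datetime:
    """Date by which operations must finish deploying the patch."""
    return published + SERVICE_LEVELS[risk]

def is_overdue(published: datetime, risk: str, now: datetime) -> bool:
    """True if the patch has missed its negotiated service level."""
    return now > deployment_deadline(published, risk)
```

Agreeing on these thresholds before a patch arrives is the point: operations can then triage incoming patches mechanically rather than renegotiating urgency with security each time.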

Foundation of the Process

The initial work will focus on defining the scope of the patch process. For instance, will it cover endpoint or server hardware (e.g., servers, desktops, laptops, mobile devices), endpoint or server software (e.g., operating systems, applications), or network devices (e.g., routers, switches, hubs)? The patch-management process should be as focused as possible so that it complements change- and configuration-management processes. An efficient and straightforward process will also hasten its adoption among distributed IT groups. The basic steps involved are inventorying the operating systems and applications that exist in the enterprise, collecting the patches applicable to that software, and installing them only on the systems that require them.

Before applying a process to any future patches, organizations need to determine the current state of their environment and obtain accurate and granular (i.e., patch-level) inventory data for the systems being supported. For example, knowing how many instances of Windows 2000 Server are running, and on which servers, will not tell IT administrators which specific security patches or bug fixes have (or have not) been applied to each machine, or when they were applied. Typically, ITOs engage in a manual collection of all known and available patches for that software, a process that can be enhanced by using automation tools. Organizations then need to take a proactive stance, research what patches, updates, and fixes are imminent, and prioritize emergency versus standard patches.
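The patch-level gap analysis described above can be sketched as a comparison of per-host inventory against the vendor's published patch list. The host names and patch IDs below are hypothetical, used only for illustration:

```python
# Hypothetical patch-level inventory: for each host, the set of
# patch IDs already applied (IDs are illustrative, not real advisories).
inventory = {
    "web01": {"KB824146", "KB828750"},
    "db01": {"KB824146"},
}

# Patches the vendor has published for this platform (illustrative).
available = {"KB824146", "KB828750", "KB828028"}

def missing_patches(inventory, available):
    """Return, per host, the published patches not yet installed."""
    return {host: sorted(available - applied)
            for host, applied in inventory.items()}
```

A report of this shape also makes the disaster-recovery concern visible: any two hosts whose gap lists differ are out of sync and will complicate a restore.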

The Patch Process

This is the process that organizations should use for any new patch releases. After initial design and implementation, many steps in this process should be automated to gain maximum efficiency and minimize cost and downtime.

Initial Analysis

  • Risk analysis: This is owned by the security group (security should have negotiated policies that specify how these priorities map to operational requirements for each environment and should have documented the domain boundaries of the various environments). Patches should be ranked using a system similar to the following, which in turn enables ITOs to understand prioritization and the level of integration testing needed later in the process:
    - Severe risk: Will have severe business impact; the threat cannot be avoided, all necessary systems must be patched, and policy must be strictly enforced.
    - High risk: Threat is significant and the patch is highly recommended, but there may be isolated cases where it is unnecessary or too complex.
    - Medium risk: Real threat; patching is recommended but can be overridden on a case-by-case basis.
    - Low risk: Not an imminent threat, or one that will have minimal impact on systems; in some cases, it may not be worth the effort to patch - again, this should be examined on a case-by-case basis.

  • Determine the cost of compliance: This is owned by the operations group (operations should have committed - or at least sought a budget for - compliance to policy, and should estimate the cost of that compliance before the policy is finalized so that the CIO can determine whether the policies are reasonable). The following are some of the variables that should be included in the cost analysis:
    - Time needed to apply each patch
    - Required downtime, if any
    - Resources required to implement the patch
    - Prerequisites for a successful installation (e.g., prior patch versions)
    - Possible compatibility issues
  • Each technology domain will need a person who determines whether the patch is a “go” or “no go” for that particular environment. Typically, organizations use a server domain (e.g., Windows, Unix, Linux, legacy).
  • Inherent within every decision should be the necessary balance of risk and cost, because not all patches can be applied:
    - A common policy is that patches rated “severe risk” must always be applied; for those, this cost/benefit step may be skipped.
  • Verify that designated patches function in the test environment. Typically due to time/budgetary constraints, most patches at the desktop level are tested against one or two key functions and then released. More mission-critical platforms (e.g., Solaris) require more rigorous testing. The severity of the risk ultimately determines the level of testing.
  • Plan the prerequisites and the implementation schedule.
  • Determine whether an outage is necessary and when the best time is to bring the system down.
  • Address prerequisites and co-requisites (a step with potential for automation).
  • Install the patch (either automated or manual).
  • Confirm that the patched systems are working properly and that the patch is successfully installed across the entire domain.
  • Update the configuration database.
  • Operations needs to report back to security regarding the status of the patch.
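The go/no-go decision in the steps above, balancing security's risk ranking against operations' cost of compliance, can be sketched as follows. The policy rules and thresholds are illustrative assumptions; each technology domain owner would tune them to their environment:

```python
def go_no_go(risk: str, cost_hours: float, budget_hours: float) -> bool:
    """Illustrative go/no-go decision for one technology domain.

    risk: security-assigned level ("severe", "high", "medium", "low")
    cost_hours: operations' estimated cost to apply the patch
    budget_hours: remaining patching budget for this domain
    """
    if risk == "severe":
        return True  # policy mandates severe-risk patches regardless of cost
    if risk == "low" and cost_hours > budget_hours / 2:
        return False  # often not worth the effort for low-risk fixes
    return cost_hours <= budget_hours
```

Encoding the policy this way keeps the case-by-case judgment where the document places it (high/medium/low) while making the non-negotiable severe-risk rule explicit.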

Our research indicates that the most difficult phases of the process are the analysis and testing of new patches. Almost every step in the patch-management process - except testing, which should remain an internal discipline due to the unique natures of individual organizations’ environments - can be rapidly enabled through automation tools. Any capability for automation in the process is critical to reducing the cost and effort associated with patching (e.g., execution of the patch implementation is possible to automate and is recommended). Due to the immaturity of the patch-management market, users should not base the process on a vendor tool, but should instead use tools for automation and efficiency. The following are some of the tools on the market:
  • Dedicated patch-management tools: These support both endpoint and server software. Examples include St. Bernard Update Expert, PatchLink Update, BigFix Patch Manager, Shavlik HFNetChkPro, Gravity Storm Service Pack Manager, and Ecora Patch Manager. Typically, these tools concentrate on the Windows environment, but over the next 12-18 months they will broaden their scope to encompass most major platforms.
  • Server automation/management tools: These offer broader OS support and additional management capabilities. Examples include Opsware, BladeLogic, CenterRun, and Consera. These are still lacking in true patch-management capabilities but during the next six months will incorporate basic functionality.
  • Electronic software distribution and management tools: These perform more traditional application distribution and inventory functions. Examples include Novadigm, Marimba, Novell ZENworks, and Microsoft SMS. As with server automation/management tools, most of these products lack true patch-management capabilities (e.g., robust knowledgebase, integration testing) but will evolve rapidly to incorporate these characteristics.

We recommend that clients currently without any server/endpoint management or software distribution/inventory tools in place consider products with built-in patch-management capabilities. Although these patch capabilities are still immature, users should expect to see rapid advancement in feature/functions in the next 12 months. However, clients seeking only to augment their traditional management or distribution solutions would be better served by implementing a dedicated tool for the task. These solutions will prove less expensive than broader tools while providing less overlap with existing management products. More significantly, they are able to automate not only the targeting, distribution, and installation of patches, but also the aggregation of new patches as they are made available by software vendors.

Critical Patch-Management Integration Points

Process integration points communicate the inputs and outputs for each process and how they interrelate (e.g., integration points among incident management, change management, and service requests). Process definitions and workflows are certainly a great starting point in developing consistent and repeatable work patterns. However, well-defined patch-management processes operating in isolation (i.e., with limited knowledge sharing across processes) do not yield significant performance returns and therefore typically result in failure. Through 2005, rigorous cross-process integration analysis will improve process success rates by more than 30% over traditional efforts by enhancing process information exchanges.

The goal of patch management is to ensure that, when the need for fixes/patches has been identified, they are effectively prioritized, managed, scheduled, implemented, etc., and that all modifications to the configuration of the environment are recorded and managed. Because patch management is inherently a collection of tasks from other processes, it is critical to effectively manage integration points between those processes (e.g., configuration management, monitoring, change management, production acceptance/control).

Business Impact: Inefficient patch management can have serious impacts on an organization’s ability to provide service to its customer base.

Bottom Line: Successful patch management requires a robust process that uses core competencies from other processes, such as change and configuration management.

META Group originally published this article on 19 November 2003.
