Azure Update Manager patch wave planning template
Operator-ready planning template for Azure Update Manager patch waves covering scope, maintenance windows, reboot tolerance, exclusions, soak periods, rollback contacts, and stop-go criteria before scheduled patching.
Good For
- Monthly patch cycle planning for Azure VMs and Arc-enabled servers
- CAB or change-ticket preparation
- Patch wave sequencing across application tiers
- Documenting exclusions, reboot limits, and validation ownership
- Creating CSV-backed wave plans for reporting or import into tracking sheets
How to Use It
- 1. Define change scope and planning horizon: Record change ID, patch month, target subscriptions, regions, and whether the plan covers Azure VMs, Arc-enabled servers, or both. State the planning horizon: pilot, wave 1, wave 2, final wave, and any freeze dates. Identify the source of truth for server ownership and service criticality.
- 2. Build the patch wave inventory: List each server or server group with hostname, resource ID, OS, environment, application/service, owner, business criticality, and PatchGroup tag. Map every asset to a wave number and maintenance configuration or planned schedule. Mark excluded systems and capture reason, approver, and review date.
- 3. Record maintenance window and reboot constraints: For each wave, capture start time, time zone, maximum duration, concurrency limit, and expected patch install window. Document reboot tolerance: allowed, disallowed, manual approval required, clustered sequencing, or application-led restart. Note dependencies such as load balancer drain, SQL AG failover order, service stop/start checks, or app team attendance.
- 4. Define soak periods and stop-go criteria: Set minimum soak period between waves, for example 24 or 48 hours, and name the approver for progression. Define stop-go thresholds such as patch install failure rate, boot failure, failed health probe, monitoring alerts, or user-impact incidents. Document rollback or containment actions, escalation contacts, and whether the next wave pauses automatically pending review.
- 5. Assign validation and reporting owners: For each wave, assign platform validator, application validator, and business confirmer. Specify required evidence after each wave: compliance snapshot, reboot outcomes, failed machine list, and app health confirmation. State where the final report, CSV, and operator notes will be stored.
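The soak period and stop-go thresholds from step 4 can be written down as a small decision function so the progression rule is unambiguous. The following is a minimal sketch, not part of Azure Update Manager: the `WaveResult` fields, the `stop_go` function, and the 5% default failure-rate threshold are all hypothetical and should be replaced with the values agreed in the change record.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class WaveResult:
    """Outcome summary for one completed wave (hypothetical field names)."""
    total_machines: int
    failed_installs: int
    boot_failures: int
    failed_health_probes: int
    user_impact_incidents: int
    completed_at: datetime  # must be timezone-aware


def stop_go(result, soak_hours, now, max_failure_rate=0.05):
    """Return (decision, reasons); any recorded reason blocks the next wave."""
    reasons = []
    if result.total_machines == 0:
        reasons.append("no machines reported for the wave")
    else:
        rate = result.failed_installs / result.total_machines
        if rate > max_failure_rate:
            reasons.append(
                f"install failure rate {rate:.1%} exceeds {max_failure_rate:.1%}"
            )
    if result.boot_failures:
        reasons.append(f"{result.boot_failures} boot failure(s)")
    if result.failed_health_probes:
        reasons.append(f"{result.failed_health_probes} failed health probe(s)")
    if result.user_impact_incidents:
        reasons.append(f"{result.user_impact_incidents} user-impact incident(s)")
    # The soak period runs from wave completion, not from wave start.
    soak_ends = result.completed_at + timedelta(hours=soak_hours)
    if now < soak_ends:
        reasons.append(f"soak period ends {soak_ends.isoformat()}")
    return ("STOP" if reasons else "GO", reasons)


if __name__ == "__main__":
    done = datetime(2024, 1, 10, 2, 0, tzinfo=timezone.utc)
    pilot = WaveResult(40, 1, 0, 0, 0, done)
    print(stop_go(pilot, soak_hours=24, now=done + timedelta(hours=25)))
```

The function only reports a decision and its reasons. The named approver from step 4 still makes the final call.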
Execution Modes
- local
Inputs and Outputs
Inputs
- Change metadata - fields: Change ID, Patch cycle/month, Planned start date, Change manager, Primary operator
- Environment scope - fields: Subscription(s), Resource group(s), Region(s), Azure VM and/or Arc-enabled server scope, Environment tag values (Prod/Test/Dev)
- Wave plan CSV columns - fields: Wave, Hostname, ResourceId, SubscriptionId, ResourceGroup, Platform, OS, Environment, Service/Application, BusinessCriticality, PatchGroupTag, MaintenanceConfig, WindowStart, TimeZone, MaxDurationMinutes, RebootPolicy, ConcurrencyLimit, DependencyNotes, ValidationOwner, AppOwner, SoakHours, StopGoCriteria, ExclusionReason, ApprovalStatus, OperatorNotes
- Decision records - fields: Exclusions and approvals, Manual reboot exceptions, Service blackout periods, Escalation contacts, Rollback or containment notes
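To keep the wave plan CSV consistent between operators, the column list above can be held in one place and used both to generate an empty template and to check an existing file. This is a minimal sketch using only the Python standard library. The names `WAVE_PLAN_COLUMNS`, `empty_template`, and `missing_columns` are illustrative, not part of any Azure tooling.

```python
import csv
import io

# Column names copied from the "Wave plan CSV columns" input above.
WAVE_PLAN_COLUMNS = [
    "Wave", "Hostname", "ResourceId", "SubscriptionId", "ResourceGroup",
    "Platform", "OS", "Environment", "Service/Application",
    "BusinessCriticality", "PatchGroupTag", "MaintenanceConfig",
    "WindowStart", "TimeZone", "MaxDurationMinutes", "RebootPolicy",
    "ConcurrencyLimit", "DependencyNotes", "ValidationOwner", "AppOwner",
    "SoakHours", "StopGoCriteria", "ExclusionReason", "ApprovalStatus",
    "OperatorNotes",
]


def empty_template():
    """Return CSV text containing only the header row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(WAVE_PLAN_COLUMNS)
    return buf.getvalue()


def missing_columns(csv_text):
    """Return expected columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    return [name for name in WAVE_PLAN_COLUMNS if name not in present]


if __name__ == "__main__":
    print(empty_template(), end="")
    print(missing_columns("Wave,Hostname\n"))
```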
Outputs
- verbose-console
- operator-notes
Command Starter
Safe to run: read-only
az account show --output table
az resource list --tag PatchGroup --query "[].{name:name,resourceGroup:resourceGroup,type:type,location:location,tags:tags}" --output table
az connectedmachine list --query "[].{name:name,resourceGroup:resourceGroup,status:status,location:location,tags:tags}" --output table
az vm list --show-details --query "[].{name:name,resourceGroup:resourceGroup,powerState:powerState,location:location,tags:tags}" --output table
az maintenance configuration list --query "[].{name:name,resourceGroup:resourceGroup,location:location,maintenanceScope:maintenanceScope}" --output table
az graph query -q "Resources | where type =~ 'microsoft.compute/virtualmachines' or type =~ 'microsoft.hybridcompute/machines' | project name, type, resourceGroup, subscriptionId, location, tags" --output json
Validation
- Every in-scope server is mapped to exactly one wave or documented as excluded with approver and reason.
- Each wave has a maintenance window, time zone, reboot policy, and named technical/app owner.
- Dependencies and sequencing for clustered or multi-tier applications are recorded and reviewed by service owners.
- Stop-go criteria, soak period, and escalation path are explicit enough to decide whether the next wave proceeds.
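The first two checks above lend themselves to an automated pass over the wave plan rows before the plan goes to review. The sketch below assumes rows are dictionaries keyed by the wave plan CSV columns (for example from `csv.DictReader`), that a non-empty `ExclusionReason` marks an excluded server, and that `ApprovalStatus` records the exclusion approval. Those conventions and the `validate_plan` name are assumptions, not a defined schema.

```python
from collections import defaultdict

# Fields every wave-assigned row is expected to carry (per the checks above).
REQUIRED_WAVE_FIELDS = [
    "WindowStart", "TimeZone", "RebootPolicy", "ValidationOwner", "AppOwner",
]


def _value(row, key):
    """Return a stripped string, treating missing or None values as empty."""
    return (row.get(key) or "").strip()


def validate_plan(rows):
    """Return a list of problems; an empty list means the plan passes."""
    problems = []
    waves_by_resource = defaultdict(set)
    for row in rows:
        rid = _value(row, "ResourceId")
        wave = _value(row, "Wave")
        if _value(row, "ExclusionReason"):
            if wave:
                problems.append(f"{rid}: excluded but also assigned to wave {wave}")
            if not _value(row, "ApprovalStatus"):
                problems.append(f"{rid}: exclusion has no approval recorded")
            continue
        if not wave:
            problems.append(f"{rid}: no wave and no exclusion reason")
            continue
        waves_by_resource[rid].add(wave)
        for field in REQUIRED_WAVE_FIELDS:
            if not _value(row, field):
                problems.append(f"{rid}: missing {field}")
    for rid, waves in waves_by_resource.items():
        if len(waves) > 1:
            problems.append(f"{rid}: mapped to multiple waves {sorted(waves)}")
    return problems
```

Dependency review and the adequacy of stop-go criteria remain human judgments. This check only catches missing or conflicting entries.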
Reporting
- Per-wave server counts by environment, service, and reboot policy.
- Excluded server list with reason, approver, and planned review date.
- Wave progression summary showing pilot completion, soak expiry, and go/no-go decision.
- Post-change evidence index: compliance export, failed machine list, application validation notes, and incident references.
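The per-wave counts in the first reporting item can be derived directly from the wave plan rows. A minimal sketch, assuming rows keyed by the wave plan CSV columns and treating any row with a non-empty `ExclusionReason` as excluded from the counts. The `wave_counts` helper is illustrative.

```python
from collections import Counter


def wave_counts(rows, dimension):
    """Count non-excluded servers per (Wave, dimension) pair.

    dimension is a wave plan column such as "Environment",
    "Service/Application", or "RebootPolicy".
    """
    counts = Counter()
    for row in rows:
        if (row.get("ExclusionReason") or "").strip():
            continue  # excluded servers are reported separately
        counts[(row["Wave"], row[dimension])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"Wave": "1", "Environment": "Test", "RebootPolicy": "allowed"},
        {"Wave": "2", "Environment": "Prod", "RebootPolicy": "allowed"},
    ]
    for (wave, value), n in sorted(wave_counts(sample, "Environment").items()):
        print(f"wave {wave}  {value}  {n}")
```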
Safety Notes
- This template is a planning aid; validate actual Azure Update Manager schedules and maintenance configurations before execution.
- Do not place tightly coupled application tiers in the same wave unless dependency and rollback handling are explicitly documented.
- Where teams span regions, record each maintenance window in both local business time and UTC to avoid ambiguity.
- Treat exclusions as temporary decisions; capture expiry or next review date so systems do not silently drift from patch coverage.
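One way to stop exclusions from silently outliving their review date is to check the exclusion list at the start of each patch cycle. The sketch below assumes each exclusion record carries a `Hostname` and an ISO-formatted `ReviewDate`. Both key names are hypothetical, since the template does not fix a schema for decision records.

```python
from datetime import date


def overdue_exclusions(exclusions, today):
    """Return (hostname, reason) pairs for exclusions that need review."""
    overdue = []
    for item in exclusions:
        raw = (item.get("ReviewDate") or "").strip()
        if not raw:
            overdue.append((item["Hostname"], "no review date recorded"))
        elif date.fromisoformat(raw) <= today:
            overdue.append((item["Hostname"], f"review due {raw}"))
    return overdue


if __name__ == "__main__":
    records = [
        {"Hostname": "app01", "ReviewDate": "2024-01-31"},
        {"Hostname": "db02", "ReviewDate": ""},
    ]
    for host, reason in overdue_exclusions(records, date(2024, 2, 1)):
        print(f"{host}: {reason}")
```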