Azure Policy Guardrails for Enterprise Landing Zones
A detailed guide to designing Azure Policy guardrails that improve security and governance without slowing delivery.
This article supports Practical Azure Landing Zone Design for Secure Enterprise Platforms by expanding the policy and governance layer.
Azure Policy is one of the most important tools in a landing zone. It turns governance from a document into a repeatable control plane. When used well, it helps teams deploy safely. When used badly, it blocks delivery, creates exceptions, and makes engineers work around the platform.
The goal is not to deny everything. The goal is to make the safe path the default path.
Current Microsoft guidance to anchor the design
Microsoft’s Azure landing zone design principles identify policy-driven governance as a core landing zone principle. Azure Policy is recommended as the mechanism for guardrails, auditing, enforcement, and control across Azure control plane and data plane configurations.
Microsoft’s Azure landing zone design areas also treat governance as a dedicated design area and emphasise automated auditing and enforcement of governance policies.
Key references:
- Azure landing zone design principles: https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/design-principles
- Azure landing zone design areas: https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/design-areas
- Azure Policy overview: https://learn.microsoft.com/en-us/azure/governance/policy/overview
What policy guardrails should do
Policy guardrails should help the platform answer these questions:
- Are resources deployed in approved locations?
- Are required tags present?
- Are risky public exposure patterns blocked or audited?
- Are diagnostic logs enabled?
- Are secure configuration settings enforced?
- Are workload teams using approved resource types?
- Are platform standards applied consistently?
- Are exceptions visible and time-bound?
A policy guardrail should be understandable. If engineers cannot understand why a deployment was denied, the guardrail will slow delivery and increase support load.
Policy modes and effects
Azure Policy supports different effects. The choice of effect matters.
Common effects include:
AuditDenyAppendModifyDeployIfNotExistsAuditIfNotExistsDisabled
Each effect has a different operational impact.
Audit first where impact is uncertain
Use Audit when introducing a new policy into an existing environment or when you are not sure how many workloads will be affected.
Audit-first policies help you answer:
- How many resources are non-compliant?
- Which teams are affected?
- Is the policy definition correct?
- Are there legitimate exceptions?
- Will deny mode break deployments?
After the impact is understood, move selected controls to Deny, Modify, or DeployIfNotExists.
Deny only where the standard is mature
Deny is useful for controls where the organisation has a clear standard and a supported alternative.
Good candidates for deny policies:
- Disallow unapproved regions
- Block public IP creation except by exception
- Prevent insecure storage configuration
- Block resource types that are not supported
- Restrict specific high-risk SKUs
Poor candidates for early deny policies:
- New tagging standards that are not yet adopted
- Diagnostic settings that have not been tested for every resource type
- Controls with many valid exceptions
- Settings where deployment tooling is inconsistent
If a deny policy causes teams to bypass the platform, the control has failed operationally even if it is technically correct.
DeployIfNotExists for platform hygiene
DeployIfNotExists can automatically deploy related configuration when a resource is created.
Good use cases include:
- Diagnostic settings
- Security configuration extensions
- Monitoring agents where appropriate
- Default configuration resources
Be careful with permissions. A policy assignment with remediation requires a managed identity with enough access to deploy the missing configuration.
Also test the resource provider behaviour. Some resources have diagnostic categories or configuration patterns that differ across services.
Modify and Append
Modify and Append can help standardise configuration. They are useful but should be tested carefully.
Examples:
- Add required tags from parent scope
- Enforce TLS settings
- Adjust allowed configuration values
- Add missing metadata
Do not use modify effects to hide poor deployment practices. Teams should still understand the required configuration.
Policy assignment strategy
Policy should be assigned at the right scope.
A practical assignment model:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Tenant Root Group
└── Broad baseline audit controls
Platform Management Group
└── Platform-specific controls
Landing Zones Management Group
└── Common workload controls
Production Management Group
└── Stricter production controls
Non-Production Management Group
└── Flexible audit and hygiene controls
Sandbox Management Group
└── Cost, expiry, and public exposure controls
This model keeps policy intent close to the environment it governs.
Baseline policy categories
A strong starting baseline includes these categories.
Location controls
Location controls restrict or audit where resources can be deployed.
Use these when region placement matters for latency, support, cost, or organisational requirements.
Design notes:
- Start with approved regions.
- Consider global services separately.
- Document why a region is approved or blocked.
- Avoid hard-coding assumptions without review.
Tagging controls
Tags support ownership, cost reporting, lifecycle, and operations.
Recommended baseline tags:
1
2
3
4
5
6
7
Application
Environment
Owner
CostCentre
ManagedBy
Criticality
ExpiryDate
Use audit or append first. Move to deny only when deployment tooling reliably applies tags.
Public exposure controls
Public exposure should be intentional.
Policy can audit or deny:
- Public IP creation
- Public network access on selected PaaS services
- Storage account public access
- Key Vault public network access
- Container registry public access
- Database services exposed publicly
Do not block all public exposure blindly. Some workloads genuinely require internet-facing endpoints. The platform should provide approved ingress patterns.
Diagnostic logging controls
Diagnostic settings are essential for operations and investigation.
Policy can deploy or audit diagnostic settings for:
- Key Vault
- Storage accounts
- Firewalls
- Network security groups
- Public IPs
- Application gateways
- SQL services
- Kubernetes clusters
- Container registries
Use central log destinations where appropriate.
Resource type controls
Restrict resource types that the platform does not support.
This reduces risk and operational sprawl.
Examples:
- Preview services not approved for use
- Services without monitoring support
- Services that create unmanageable cost risk
- Services not supported by the operations team
Resource type restrictions should be reviewed periodically. Azure changes quickly.
Initiative design
Group related policies into initiatives.
Examples:
- Baseline security initiative
- Logging and monitoring initiative
- Tagging initiative
- Network exposure initiative
- Production workload initiative
- Sandbox control initiative
Initiatives make assignment easier and help reporting.
Avoid giant initiatives that mix unrelated controls. If an initiative is too broad, exceptions become messy.
Parameterisation
Use parameters for values that change between environments.
Examples:
- Allowed locations
- Required tags
- Log Analytics workspace resource ID
- Approved private DNS zones
- Allowed SKUs
- Effect mode: audit or deny
Parameterisation allows the same policy definition to work across scopes without copying definitions.
Exemptions and exceptions
Exceptions must be governed.
A good exemption record includes:
- Policy or initiative name
- Resource or scope
- Owner
- Reason
- Risk acceptance
- Expiry date
- Compensating controls
- Approval reference
Never create permanent exceptions without periodic review.
Policy as code
Policy should be managed as code.
The policy repository should include:
1
2
3
4
5
6
7
policy/
├── definitions/
├── initiatives/
├── assignments/
├── exemptions/
├── parameters/
└── tests/
Use pull requests for policy changes. This makes changes reviewable and auditable.
Testing policy changes
Policy changes should be tested before production assignment.
Recommended process:
- Validate JSON or Bicep/Terraform syntax.
- Assign in a test scope.
- Run sample deployments.
- Review compliance results.
- Confirm remediation behaviour.
- Move to audit in non-production.
- Move to production after review.
- Move to deny only after impact is understood.
Operational reporting
Policy compliance should be reviewed regularly.
Useful views:
- Compliance by management group
- Compliance by subscription
- Compliance by policy initiative
- Non-compliant resources by owner
- Exemptions expiring soon
- Deny events by team
- Remediation task status
Policy reporting should be linked to action. A dashboard with no owner does not improve governance.
Common mistakes
Too many deny policies too early
This creates friction and leads to workarounds.
No exception process
Without a process, exceptions become hidden platform behaviour.
Duplicated policy assignments
Duplicate assignments create confusing compliance results.
Poorly named policies
Names should describe intent, not just technical implementation.
No remediation ownership
If a policy reports non-compliance, someone must own fixing it.
Recommended implementation plan
Start with this sequence:
- Inventory current policies.
- Define baseline categories.
- Create initiatives.
- Assign in audit mode.
- Review compliance results.
- Fix deployment templates.
- Add remediation where safe.
- Move selected controls to deny.
- Establish exception review.
- Report compliance monthly.
Final recommendation
Azure Policy should make governance repeatable, visible, and fair.
The best policy model is one that platform teams can maintain, workload teams can understand, and operations teams can act on.
Policy is not just enforcement. It is one of the main ways the landing zone communicates what good looks like.