A common task for Azure administrators is to ensure that resources in their Azure subscriptions are configured according to business requirements or regulatory standards. To help administrators with this kind of task, Microsoft provides many built-in Azure Policy definitions and policy sets, also known as policy initiatives, that can audit Azure resources against security and redundancy best practices, check regulatory compliance, and even look inside your VMs with Azure Policy's Guest Configuration.
All these built-in policies are a great source of inspiration for what you can achieve with this Azure service and how it can make your job much easier. They can be as simple as enforcing mandatory tags or specific resource naming patterns, or more complex, modifying the configuration of resources or deploying missing parts. Today, however, let’s focus on the auditing part only.
Azure Policy effect – ‘AuditIfNotExists’
The most commonly used ‘Audit’ policy effect is simple to understand: it only creates a warning event if a resource is not compliant. It has no additional parameters or configuration options and is limited to the condition of the policy rule. The ‘AuditIfNotExists’ effect is more interesting, as it can verify the state of related components for the resources that match the ‘if’ part of the policy rule. So, as in the official example, you can check whether a virtual machine extension resource exists alongside the VM resource and whether that extension has the specific attributes that identify it as the Antimalware one.
So, let’s take action and consider a practical task: you need to ensure that your Azure SQL databases have proper retention policy configurations to comply with an internal backup policy, which, by the way, is a good thing to define and enforce in any infrastructure.
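To make the VM extension case concrete, here is a sketch of a policy rule along the lines of the official documentation sample. The publisher and type values identify the Microsoft Antimalware extension; treat the rule as an illustration and check it against the current documentation before reusing it.

```json
{
  "if": {
    "field": "type",
    "equals": "Microsoft.Compute/virtualMachines"
  },
  "then": {
    "effect": "auditIfNotExists",
    "details": {
      "type": "Microsoft.Compute/virtualMachines/extensions",
      "existenceCondition": {
        "allOf": [
          {
            "field": "Microsoft.Compute/virtualMachines/extensions/publisher",
            "equals": "Microsoft.Azure.Security"
          },
          {
            "field": "Microsoft.Compute/virtualMachines/extensions/type",
            "equals": "IaaSAntimalware"
          }
        ]
      }
    }
  }
}
```

The ‘if’ part selects the virtual machines, while ‘details’ tells the policy engine which related resource type to look for and which properties it must have to count as existing.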
Example of auditing the configuration of Azure SQL database retention policies
Azure SQL databases are a convenient relational database-as-a-service solution that allows you to focus more on the development of your applications and to spend less effort on their maintenance. However, they still require suitable operational tasks, such as regular backups.
Although Microsoft implemented such a cool feature as automated backups for Azure SQL single and pooled databases, which is enabled by default, business-critical systems commonly require backups to be kept for much longer periods. So, for production databases, you should usually have both short-term and long-term retention policies configured according to your backup policy requirements.
Also, from an operational perspective, the databases you manage might have different criticality for your business continuity. Some of them can be highly critical, meaning that you can go out of business if you lose them; others will have little or no value and can go offline without any significant impact on core business operations.
Apart from that, in IT Service Operation, it is a widespread practice to define such criticality levels and assign one of them to every service in operation. This way, you can define operational service requirements such as SLO, RTO, RPO, etc. per criticality level rather than for each application individually, so you have fewer rules to manage. Putting new services into operation also becomes easier, as you don’t have to develop a new set of rules each time and can just agree on the service criticality with business people.
Imagine that you have three service criticality levels: high, medium and low. You must ensure that the Azure SQL databases related to your services have retention policy settings that comply with the backup policy for each of those criticality levels.
So, first of all, you need some information about the criticality level assigned to your databases. The most obvious approach is to use Azure metadata tags and assign each database a specific tag with a defined criticality level value. There are plenty of ways to do that: create the tag during deployment from the ARM template, create an Azure Policy to deny the deployment of resources without a specific tag, or inherit the corresponding tag with its value from the resource group level. The latter approach might be especially preferable if you use resource groups as logical boundaries for your applications or services and define the criticality for the whole service and not its individual parts, as I mentioned previously.
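As an illustration of the ‘deny’ option, a minimal policy definition could look like the sketch below. The tag name `criticality` is an assumption made for this example, and the `master` database is excluded because it is a system database created together with the logical server.

```json
{
  "properties": {
    "displayName": "Require a criticality tag on Azure SQL databases",
    "description": "Illustrative sketch. The tag name 'criticality' is an assumption made for this example.",
    "mode": "Indexed",
    "policyRule": {
      "if": {
        "allOf": [
          {
            "field": "type",
            "equals": "Microsoft.Sql/servers/databases"
          },
          {
            "field": "name",
            "notEquals": "master"
          },
          {
            "field": "tags['criticality']",
            "exists": "false"
          }
        ]
      },
      "then": {
        "effect": "deny"
      }
    }
  }
}
```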
Next, when you have all your production databases marked with criticality level tags, you can create custom Azure policies to audit their backup settings.
For example, you can use the following Azure Policy rule snippet for auditing the configuration of Azure SQL database short-term retention policy:
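A minimal sketch of such a definition is shown here. It assumes a tag named `criticality` and two hypothetical parameters, `criticality` and `retentionDays`; the alias in the ‘existenceCondition’ follows the usual resource type and property naming pattern and should be verified (for example, with `Get-AzPolicyAlias`) before use. Comparing with `greaterOrEquals` treats the parameter as a minimum; use `equals` if your backup policy demands an exact value.

```json
{
  "properties": {
    "displayName": "Audit short-term backup retention of Azure SQL databases",
    "description": "Illustrative sketch. Tag name, parameter names and the alias are assumptions to verify before use.",
    "mode": "Indexed",
    "parameters": {
      "criticality": {
        "type": "String",
        "metadata": {
          "description": "Hypothetical parameter: value of the 'criticality' tag to match, e.g. high."
        }
      },
      "retentionDays": {
        "type": "Integer",
        "metadata": {
          "description": "Hypothetical parameter: minimum short-term retention in days."
        }
      }
    },
    "policyRule": {
      "if": {
        "allOf": [
          {
            "field": "type",
            "equals": "Microsoft.Sql/servers/databases"
          },
          {
            "field": "name",
            "notEquals": "master"
          },
          {
            "field": "tags['criticality']",
            "equals": "[parameters('criticality')]"
          }
        ]
      },
      "then": {
        "effect": "auditIfNotExists",
        "details": {
          "type": "Microsoft.Sql/servers/databases/backupShortTermRetentionPolicies",
          "existenceCondition": {
            "field": "Microsoft.Sql/servers/databases/backupShortTermRetentionPolicies/retentionDays",
            "greaterOrEquals": "[parameters('retentionDays')]"
          }
        }
      }
    }
  }
}
```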
Or, the following snippet for the long-term retention policy:
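A comparable sketch for long-term retention follows. Again, the tag name `criticality` and all parameter names are assumptions made for illustration. The retention values are ISO 8601 duration strings such as `P4W`, which is how the long-term retention policy resource stores them, and the three aliases should be verified before use.

```json
{
  "properties": {
    "displayName": "Audit long-term backup retention of Azure SQL databases",
    "description": "Illustrative sketch. Tag name, parameter names and aliases are assumptions to verify before use.",
    "mode": "Indexed",
    "parameters": {
      "criticality": {
        "type": "String",
        "metadata": {
          "description": "Hypothetical parameter: value of the 'criticality' tag to match."
        }
      },
      "weeklyRetention": {
        "type": "String",
        "metadata": {
          "description": "Hypothetical parameter: ISO 8601 duration, e.g. P4W."
        }
      },
      "monthlyRetention": {
        "type": "String",
        "metadata": {
          "description": "Hypothetical parameter: ISO 8601 duration, e.g. P12M."
        }
      },
      "yearlyRetention": {
        "type": "String",
        "metadata": {
          "description": "Hypothetical parameter: ISO 8601 duration, e.g. P5Y."
        }
      }
    },
    "policyRule": {
      "if": {
        "allOf": [
          {
            "field": "type",
            "equals": "Microsoft.Sql/servers/databases"
          },
          {
            "field": "name",
            "notEquals": "master"
          },
          {
            "field": "tags['criticality']",
            "equals": "[parameters('criticality')]"
          }
        ]
      },
      "then": {
        "effect": "auditIfNotExists",
        "details": {
          "type": "Microsoft.Sql/servers/databases/backupLongTermRetentionPolicies",
          "existenceCondition": {
            "allOf": [
              {
                "field": "Microsoft.Sql/servers/databases/backupLongTermRetentionPolicies/weeklyRetention",
                "equals": "[parameters('weeklyRetention')]"
              },
              {
                "field": "Microsoft.Sql/servers/databases/backupLongTermRetentionPolicies/monthlyRetention",
                "equals": "[parameters('monthlyRetention')]"
              },
              {
                "field": "Microsoft.Sql/servers/databases/backupLongTermRetentionPolicies/yearlyRetention",
                "equals": "[parameters('yearlyRetention')]"
              }
            ]
          }
        }
      }
    }
  }
}
```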
Both examples contain the ‘if’ rule condition, which matches an Azure SQL database with a specific criticality value, and the ‘details’ section of the ‘AuditIfNotExists’ effect, which evaluates the existence of specific retention policy resources.
As you might notice, I provided the tag value and retention settings as policy parameters to make the same policy definitions reusable for different criticality levels. Also, pay attention to the logical evaluations in the ‘AuditIfNotExists’ effect: the rule will create warning events for all resources that don’t satisfy the ‘existenceCondition.’
To make the usage of such policies more convenient, you can group them in policy initiatives and assign them to target scopes. For example:
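The following is a sketch of an initiative for one criticality level. It assumes two custom policy definitions, one for short-term and one for long-term retention, that accept hypothetical parameters named `criticality`, `retentionDays`, `weeklyRetention`, `monthlyRetention` and `yearlyRetention`. The definition IDs are placeholders, and the retention values are purely illustrative rather than recommendations.

```json
{
  "properties": {
    "displayName": "Audit backup retention for high-criticality Azure SQL databases",
    "description": "Illustrative sketch. Definition IDs are placeholders; parameter names and values are assumptions.",
    "policyDefinitions": [
      {
        "policyDefinitionId": "/subscriptions/<subscription-id>/providers/Microsoft.Authorization/policyDefinitions/<short-term-retention-definition>",
        "parameters": {
          "criticality": {
            "value": "high"
          },
          "retentionDays": {
            "value": 35
          }
        }
      },
      {
        "policyDefinitionId": "/subscriptions/<subscription-id>/providers/Microsoft.Authorization/policyDefinitions/<long-term-retention-definition>",
        "parameters": {
          "criticality": {
            "value": "high"
          },
          "weeklyRetention": {
            "value": "P4W"
          },
          "monthlyRetention": {
            "value": "P12M"
          },
          "yearlyRetention": {
            "value": "P5Y"
          }
        }
      }
    ]
  }
}
```

Each parameter passed to a member policy is wrapped in an object with a `value` property. Similar initiatives for the medium and low levels would reuse the same two definitions with different values.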
When designing your policy initiatives, note that their parameters should be provided as an object type in order for your solution to work. Besides, if you define your custom Azure policies and initiatives in ARM templates as I do, remember to compose their parameter files accordingly so that they pass objects.
For full code samples, check out my repository for Azure Policies on GitHub.
What is more
Despite its relative simplicity, the mechanics of the ‘AuditIfNotExists’ effect can be very powerful and can be applied to almost any configuration scenario. Basically, you can verify the configuration of almost all Azure resource types and create a solid governance framework for your Azure subscriptions. Considering that you can assign Azure Policies not only to subscriptions but also to management groups or individual resource groups, this technique creates really impressive possibilities for managing your Azure resources with greater efficiency and transparency.