Access management in Azure is a broad and complex topic consisting of many interconnected parts, including Azure Active Directory, the Role-based Access Control (RBAC) model, resource permissions, service-specific access configurations, etc. Here, I would like to give you a few examples of how you can improve resource access management in Azure as a part of your Cloud Governance strategy.
The inputs
Regardless of organization size or the number of deployed Azure resources, I regularly encounter the following common scenario:
An organization has multiple Azure resources, providing access to different teams and individuals to manage those resources according to company goals—business as usual, as you might say. However, the way that access is provided is often quite far from operational ideals. Permissions are granted by creating Azure Role assignments for user and group accounts. The assignments are created manually at the resource group and resource levels. No unified access model exists, and the user experience is inconsistent. At best, access requests are submitted via an ITSM system and processed centrally by IT operations, creating historical records for security audits. At worst, users just granted Owner permissions on resource groups and manage access in whatever way they like.
Azure built-in Owner and User Access Administrator roles allow their members to manage access and assign roles in Azure RBAC.
Apart from the apparent disadvantages of the described approach, such as manual work, inconsistency, and delays in getting access, it also contains many hidden pitfalls. For instance, you can quickly exceed the limit of 2000 role assignments per subscription and won’t be able to grant new accesses. The users are likely to be granted insufficient or excessive permissions. In the latter case, overprivileged users with owner permissions might create or edit role assignments independently, bypassing the regular access management routine.
So, can we design a solution that does not require granting application owners permission to manage resource access by uncontrollably creating new Azure role assignments while being user-friendly and efficient?
Option #1: A minimum investment
Let’s start with minimalistic improvements to the initial design. As you might notice from the diagram above, the two significant changes in the access management process are granting access at the resource group level only and automating role assignment provisioning.
Firstly, granting access at the resource group level will help to simplify your access control lists (ACL) and reduce the number of role assignments. Sure, you won’t get the same granular access level as when creating role assignments at the resource level. Still, it will be a reasonable tradeoff in most cases, especially when you design your resource groups according to the resource lifecycle principle, creating independent application services.
Secondly, automating role assignment creation will allow you to save the time you spend doing it manually. It can be implemented as an Azure Automation runbook, a Logic App, an Azure Function, or any other technical means that can invoke Azure REST APIs. The automation app will require input from a resource group, an Azure role, and a user or group identity. I suggest programming the create, update and delete (CRUD) operations as separate functions to keep your automation logic clean and organized.
Finally, the process entry point can be defined as a request in your ITSM solution. To get access, users must fill out the request form and submit the ticket. A resource group or application owner must approve the request before the ITSM triggers the automation.
Option #2: Tightening the ropes
This option evolves from Option # 1. The key difference with the predecessor is that instead of dynamically creating new role assignments, we will add users to the security groups with predefined permissions. That will come to the rescue in large environments where Azure subscriptions contain hundreds of resource groups, and you might hit the limit of 2000 role assignments per subscription quite fast. Apart from that, all resource access will be managed via the security groups, which security administrators usually prefer.
That process design uses two automated steps. The first automation is triggered when a new resource group is created (for instance, you can use an Activity log alert for that). It will generate several security groups and provision role assignments for them using predefined Azure roles (built-in or custom). Those security groups can be either cloud-based or synced from your on-premise AD domain. The second automation adds/deletes users from those access groups. The process initiation and access approval steps remain the same.
Of course, such a design has its drawbacks. First, it’s more complicated and requires more effort to implement compared to Option # 1. Second, provisioning predefined access groups for your Azure resource groups creates a bunch of security groups that might never be used. Besides, if you use on-premise groups for access provisioning, you should count on the synchronization delays between your AD controller and Azure Active Directory.
You can definitely discard creating the access groups upon resource group creation and come up with automation that will create them on demand when an access request is created. However, that approach will require creating more complex automation logic: verifying existing access groups, creating a new access group (if needed) as part of the access request flow, and finally adding users to that group.
Option #3: A self-service kiosk
The third option is self-service. It can be a good choice for non-production environments and teams with full ownership over their applications, aka “you build it, you run it.”
The automation part is involved only when a resource group is created. It creates several predefined security groups and role assignments for them, similar to Option # 2. In addition to creating those predefined groups, the automation also sets their owner, who is the actual resource group (or application) owner approving access to the resources. By doing so, the application owner can manage the membership of those security groups.
When users need access to resources in a specific Azure resource group, they should reach out to the resource group owner about that. The resource group owner has enough permission to add the user to the security group, granting sufficient access to the resources without involving any third party. In many cases, the resource group owner and the person requesting access work on the same team, thus making the whole process much faster and more efficient as those teammates don’t have to submit support tickets and rely on other teams to fulfill their needs.
As the resource group owner technically doesn’t have Owner permissions over the resource group, they cannot directly modify role assignments on the resource group. They will have to use the predefined security groups to manage access. That will keep the structure and the number of role assignments in a subscription organized and auditable.
Final notes
Careful readers might notice that I omitted some aspects of the described designs. Indeed, for the sake of simplicity, I didn’t go into detail about the required internals of ITSM solutions or get information about resource/application owners, as that can be managed in different ways.
It is highly likely that the approvals in different ITSM systems will have some internal configuration nuances that depend on your system and its configuration.
If you store the information about resource owners in Azure tags, as I described in my posts about running CMDB for Azure, you can efficiently utilize it in your automation scripts. If that information is stored somewhere else, you must find your way to get it.
I hope those sample process designs will provide some ideas of how you can improve access management for Azure resources in your environments. Put your question and thoughts about it into the comments if you would like to know more about that topic 👇