Enforcing IAM least privilege with AWS SAM + Airtable
Grabyo has been an AWS partner for a number of years now. One of the requirements to be an AWS partner is to pass their FTR (Foundational Technical Review). An FTR is a checklist of requirements you must fulfil to be an AWS partner. One of these requirements is:
IAM-008 — Audit identities quarterly.
Auditing the identities that are configured in your identity provider and IAM helps ensure that only authorized identities have access to your workload. For example, remove people that leave the organization, and remove cross-account roles that are no longer required. Have a process in place to periodically audit permissions to the services accessed by an IAM entity. This helps you identify the policies you need to modify to remove any unused permissions. For more information, see Refining permissions in AWS using last accessed information.
However, following this recommended practice is easier said than done. AWS recommends using their Access Advisor tool to identify permissions no longer required.
At the time of writing, we have 846 roles across all our AWS accounts. Without a better system, we would need to:
- Ask our engineers to check Access Advisor for all the roles regularly. Let’s say every quarter.
- They would need to produce a report in Access Advisor manually.
- Validate against our code base if the unused permissions are still required.
While this is doable, I don’t recommend it, for the following reasons:
- Covering all roles is challenging. We must cross-reference all roles with our Terraform code in Git to determine the right ownership.
- There is enormous potential for optimizing this process and reducing the manual work done by our engineers. For example:
  - A permission to send an SNS alert if the service is down might not have been used in the last three months but may still be needed.
    - This scenario is excruciating because this permission will need a re-review in the next quarter (assuming there is no service downtime), wasting precious engineering time.
- Since the process is implemented quarterly, there is a potential for almost a three-month delay in detecting unused permissions, keeping our platform less secure.
The ideal solution
To improve this process, we decided to build a new service that should fulfil these requirements:
- We want to assign roles to owners (teams) responsible for reviewing all unused permissions.
- Once an unused permission for a role is detected, the appropriate team must be notified to take immediate action.
- If the permission is not required, the team must remove it from the role.
- If the permission is still required, the team should be able to mark it as needed, and the alerting should not notify about it again.
- To help with governability, we want dashboards containing near real-time information about all team roles and unused permissions.
- We want to be able to define a threshold for considering a permission unused.
  - Initially, we have decided that an unused permission is one that has not been used in the last three months (we use this definition for the rest of the document). However, we want to be able to adjust this value easily in the future.
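As a rough illustration of what an adjustable threshold could look like, here is a minimal Python sketch. The names `UNUSED_THRESHOLD` and `is_unused` are hypothetical, 90 days is only an assumed approximation of "three months", and treating a never-used permission as unused is also an assumption rather than something the service is confirmed to do.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "three months" is approximated as 90 days.
UNUSED_THRESHOLD = timedelta(days=90)


def is_unused(
    last_used: Optional[datetime],
    now: datetime,
    threshold: timedelta = UNUSED_THRESHOLD,
) -> bool:
    """Return True when a permission counts as unused.

    A permission with no recorded use (last_used is None) is treated as
    unused, which is an assumption of this sketch.
    """
    if last_used is None:
        return True
    return now - last_used > threshold
```

Because the threshold is a plain parameter, tightening it later (for example to 30 days) would not require touching the rest of the logic.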
Introducing IAM Detective
To enhance the efficiency of our permission management process, we’ve introduced a new internal service known as IAM Detective. However, the significance of this blog post extends beyond this service. I find the technologies we’ve employed truly remarkable, and I’m eager to share my experience because they’ve enabled me, someone who doesn’t work as a developer day to day, to build an entirely new service from scratch in a matter of days. These technologies are:
AWS SAM (AWS Serverless Application Model)
We have chosen to use AWS SAM because we believe it is the fastest way to create backend serverless microservices on top of AWS Lambda. The framework provides extensive tooling out of the box that makes developing on top of it extremely fast and efficient.
Airtable
The Airtable Connected Apps Platform™ empowers teams to build custom apps on top of shared data.
Due to the simple and powerful APIs and data representations, we have used Airtable as our main data store. Airtable stores its data similarly to traditional SQL databases. This data representation is what makes Airtable really powerful. By default, the data gets represented as traditional tables on a spreadsheet. However, you can create other views like List, Kanban, Calendar, etc. You can achieve further customization with additional filters. I recommend having a look here for more details.
On top of this, Airtable provides three key features that make it the perfect choice for this service:
- It provides interfaces which are the perfect tool for creating dashboards.
- It provides IFTTT automation and simple JavaScript scripting to perform automated tasks.
- It integrates with different cloud services, which makes expandability very easy. For example, we use Airtable to send alerts to Opsgenie.
IAM Detective architecture
As you can see, the architecture of this service is pretty simple. The backend service runs daily and is built on top of AWS SAM with a combination of Lambda functions that list all IAM roles and create access detail reports for all of them. The backend only updates Airtable for roles that have permissions unused for more than three months.
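To make the "list roles, then build an access report for each" flow more concrete, here is a minimal Python sketch, not the actual backend code. It assumes `iam` is a `boto3.client("iam")` and uses the IAM calls behind Access Advisor (`generate_service_last_accessed_details` and `get_service_last_accessed_details`). The function names, the service-level granularity, and the 90-day threshold are assumptions made for illustration.

```python
import time
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterator, List


def iter_roles(iam: Any) -> Iterator[Dict[str, Any]]:
    """Yield every IAM role in the account, following pagination."""
    for page in iam.get_paginator("list_roles").paginate():
        yield from page["Roles"]


def last_accessed_services(
    iam: Any, role_arn: str, poll_seconds: float = 1.0
) -> List[Dict[str, Any]]:
    """Run a last-accessed report for one role and return its per-service rows."""
    job_id = iam.generate_service_last_accessed_details(Arn=role_arn)["JobId"]
    report = iam.get_service_last_accessed_details(JobId=job_id)
    while report["JobStatus"] == "IN_PROGRESS":
        time.sleep(poll_seconds)
        report = iam.get_service_last_accessed_details(JobId=job_id)
    if report["JobStatus"] != "COMPLETED":
        raise RuntimeError(f"Access report failed for {role_arn}")
    rows = list(report["ServicesLastAccessed"])
    # Large reports are truncated; keep fetching with the returned marker.
    while report.get("IsTruncated"):
        report = iam.get_service_last_accessed_details(
            JobId=job_id, Marker=report["Marker"]
        )
        rows.extend(report["ServicesLastAccessed"])
    return rows


def unused_services(
    iam: Any,
    role_arn: str,
    now: datetime,
    threshold: timedelta = timedelta(days=90),  # assumed "three months"
    poll_seconds: float = 1.0,
) -> List[str]:
    """Return service namespaces the role has not used within the threshold."""
    unused = []
    for row in last_accessed_services(iam, role_arn, poll_seconds):
        last = row.get("LastAuthenticated")  # key is absent when never used
        # `now` must be timezone-aware, like the datetimes boto3 returns.
        if last is None or now - last > threshold:
            unused.append(row["ServiceNamespace"])
    return unused
```

In a Lambda handler, something like `unused_services(boto3.client("iam"), role["Arn"], datetime.now(timezone.utc))` could be called for each role from `iter_roles`, and only roles with a non-empty result would be pushed to Airtable.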
You can read all about this backend in this blog post Building a Microservice Lightning-Fast with AWS SAM. The rest of the blog post will focus on Airtable and how we use it to solve the service requirements.
Data model
The previous data model diagram has 3 main tables (in red) with different attributes. The backend service manages the blue attributes, and they are a direct representation of AWS attributes. The red attributes are managed by engineers on Airtable.
- Roles: The list of all roles in the platform that contain unused permissions. This table contains the attribute Owner (the team responsible for reviewing all unused permissions associated with that role).
- Unused permissions: The list of unused permissions associated with the roles and permissions tables. This table contains the attribute Status used to drive logic within Airtable. Concretely:
  - To Review: An engineer must review the permission (this is the default status on creation).
  - Accepted: The unused permission is still required even if it hasn’t been used in the last 3 months.
  - To Delete: The unused permission is unnecessary, and an engineer will shortly remove it from the role.
- Permissions: The list of existing permissions in AWS for IAM roles. This list is auto-generated by the backend service.
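For readers who prefer code to diagrams, the tables above could be modelled roughly as follows. This is a sketch only: the class and field names are hypothetical, and only the Owner attribute and the three Status values come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(str, Enum):
    """Review status of an unused permission, managed by engineers."""
    TO_REVIEW = "To Review"  # default on creation
    ACCEPTED = "Accepted"    # still required despite not being used
    TO_DELETE = "To Delete"  # will be removed from the role


@dataclass
class UnusedPermission:
    role_arn: str      # managed by the backend
    permission: str    # managed by the backend
    status: Status = Status.TO_REVIEW


@dataclass
class Role:
    arn: str                      # managed by the backend
    name: str                     # managed by the backend
    owner: Optional[str] = None   # team assigned by engineers in Airtable
    unused_permissions: List[UnusedPermission] = field(default_factory=list)


def needs_review(role: Role) -> List[UnusedPermission]:
    """Return the permissions of a role that nobody has reviewed yet."""
    return [p for p in role.unused_permissions if p.status is Status.TO_REVIEW]
```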
Dashboards
The backend will fill Airtable with many roles and unused permissions the first time it runs. It’s unavoidable that the team goes through all the roles first to assign owners. Once owners are assigned, all engineers must thoroughly sweep all unused permissions to mark them as Accepted or To Delete.
Using Airtable’s interface designer, we can create interfaces in minutes to help with this process. The following dashboard is great for governance as it allows us to understand the overall status of the process and which teams have completed it.
Airtable can be difficult to navigate for new users, and it’s easy to make mistakes. For this reason, we have created a second interface to simplify the review process for our engineers:
This interface has a simple set of instructions for review and only two actions for each unused permission:
- Keep: In this case, the engineer must indicate why the permission is required, in case of future auditing. Once the permission is marked as Accepted, the system won’t alert you about it again.
- Delete: In this case, the engineer is expected to remove the permission from the AWS role. Once the permission is deleted from the role, the backend will remove it from the database.
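One way to picture how the daily backend run and these two actions fit together is as a small reconciliation step. The sketch below is an assumption about how such a sync could work, not a description of the real implementation; `plan_sync` and its arguments are hypothetical names. Records that already exist are left untouched, so a status set by an engineer (such as Accepted) survives each run.

```python
from typing import Dict, List, Set, Tuple

Key = Tuple[str, str]  # (role name, permission)


def plan_sync(
    existing: Dict[Key, str], detected: Set[Key]
) -> Tuple[List[Key], List[Key]]:
    """Compare Airtable records with the latest scan.

    existing maps each record already in Airtable to its Status.
    detected is the set of unused permissions found by today's run.
    Returns (records to create, records to delete).
    """
    known = set(existing)
    to_create = sorted(detected - known)  # new findings, created as "To Review"
    to_delete = sorted(known - detected)  # no longer reported as unused
    return to_create, to_delete
```

For example, a permission that an engineer removed from the role stops appearing in `detected`, so it ends up in the delete list on the next run.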
Alerting
Alerting teams when new unused permissions are found is a key requirement for this new service. We can use Airtable automations to solve this challenge.
This automation has the following steps:
- The trigger is a scheduled event that executes the automation daily at 4:00 a.m. BST.
- The second step searches for all the roles that contain new unused permissions that haven’t been alerted before.
- The last step sends an email to an Opsgenie endpoint to generate an alert for every role found.
I think this IFTTT UI from Airtable is pretty intuitive, and you can easily understand it by playing with it. However, I think it’s worth covering the logic implemented in Airtable to ensure that:
- We don’t generate individual alerts for unused permissions that belong to the same role.
- We don’t resend alerts for unused permissions daily (i.e. if the permission was not addressed).
To accomplish this behaviour, we have created some additional fields in Airtable:
Alert in the UnusedPermissions table:
<code>IF(DATETIME_DIFF(NOW(), Created, 'days') < 2, "ALERT", "IGNORE")</code>
This formula attribute writes the string ALERT if the unused permission was created in Airtable less than two days ago (the threshold used in the formula above) or IGNORE if it’s older than that. This is the key logic to avoid sending repetitive alerts.
AlertLookup in the Roles table:
This Lookup attribute collects the values of the previous Alert attribute from all the unused permissions linked to the role and joins them into a single value. The end result looks like this:
<code>IGNORE, IGNORE, ALERT, IGNORE</code>
Alert Role in the Roles table:
<code>IF(FIND("ALERT", {AlertLookup (from UnusedPermissions)}), "ALERT", "IGNORE")</code>
This formula attribute searches for the string “ALERT” in the previous attribute, AlertLookup, and writes ALERT if it is found or IGNORE otherwise.
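If Airtable formulas are unfamiliar, the chain of three fields above can be read as the following Python sketch. It only mirrors the formulas for illustration: the function names are hypothetical, and the two-day window is taken from the first formula as written.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def permission_alert(created: datetime, now: datetime) -> str:
    """Mirror of the Alert field: recent records alert, older ones are ignored."""
    # Same comparison as DATETIME_DIFF(NOW(), Created, 'days') < 2.
    return "ALERT" if now - created < timedelta(days=2) else "IGNORE"


def alert_lookup(flags: Iterable[str]) -> str:
    """Mirror of the AlertLookup field: join the flags of one role."""
    return ", ".join(flags)


def role_alert(lookup: str) -> str:
    """Mirror of the Alert Role field: one ALERT is enough to alert the role."""
    return "ALERT" if "ALERT" in lookup else "IGNORE"
```

The role-level field is what keeps a role with several new unused permissions down to a single alert, and the time window on the first field is what stops the same permission from alerting every day.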
Using the attribute Owner, we can create an automation per team to route the alerts to the right team.
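The per-team routing can be thought of as a simple grouping by Owner. The sketch below assumes each role is a dictionary keyed by Airtable field name; "Owner" and "Alert Role" come from the fields described above, while "Name" and the "Unassigned" fallback are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def alerts_by_owner(roles: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group the names of roles flagged ALERT by their owning team."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for role in roles:
        if role.get("Alert Role") != "ALERT":
            continue
        # Roles without an owner yet are collected under a placeholder team.
        grouped[role.get("Owner") or "Unassigned"].append(role["Name"])
    return dict(grouped)
```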
Alert routing
As mentioned before, Opsgenie is the entry point for all alerts and notifications in the platform. Opsgenie is responsible for routing them to the right destination based on their severity.
Opsgenie notifies the engineering rota of the most urgent alerts. However, because these notifications are not that urgent, it creates Jira issues for the different teams to address on the next working day or sprint, and it notifies engineers on Slack so they get instant notifications as well.
Closing comments
In conclusion, IAM Detective is a game-changer for AWS IAM permissions management. By combining AWS SAM and Airtable, we’ve created an efficient and proactive system that empowers teams to manage permissions effectively. IAM Detective aligns with AWS Well-Architected principles and simplifies governance with real-time dashboards and automated alerts. We’re excited about its future potential and invite you to explore the detailed architecture in the linked blog post. If you’re ready to enhance your AWS IAM management, IAM Detective is here to assist.
We’re hiring!
We’re looking for talented engineers in all areas to join our team and help us to build the future of broadcast and media production.