Building a Microservice Lightning-Fast with AWS SAM

This blog post is a follow-up to my previous blog post, Enforcing IAM least privilege with AWS SAM + Airtable. I will shift the focus of this blog post from the actual microservice, which is very simple, to the great benefits of using AWS SAM to build it.

I don’t write code regularly anymore. However, I built this service at a production-ready level using AWS SAM in a few days without much effort. Leaving the code aside, I’ll try to walk you through the steps I had to take to define the required supporting infrastructure using SAM and hopefully highlight how easy it was.

Why do I ❤️ SAM?

Before diving deep into the microservice, let’s cover some key features that make me like SAM.

Cloud deployments and sandbox environments

SAM allows you to deploy your full microservice in the cloud in seconds. During my tests with the HelloWorld example, SAM took about 10 seconds to deploy the entire stack (API GW + Lambda) into an AWS account. However, more importantly, you can create dedicated sandbox environments by naming your stack differently:

sam deploy --profile sam-dev --stack-name bob-iam-detective

This approach allows multiple developers to create a dedicated stack for their development and testing without interfering with other developers/stages.

Live sync deployments

SAM relies on CloudFormation to deploy your stack. This approach translates to long waiting times (i.e. 10 seconds) when making micro changes. To solve this problem, AWS provides live sync deployments by using the following command:

sam sync --profile sam-dev --stack-name bob-iam-detective

The previous command syncs your application with your cloud deployment using direct AWS APIs instead of CloudFormation. This translates into near real-time updates to your stack when making changes to your code.

However, note that SAM only supports this feature on AWS Lambda, Amazon API Gateway, and AWS Step Functions APIs.

Localhost development made easy

The following SAM command launches your stack in a local Docker container for testing. If you pair this environment with LocalStack, you get an excellent environment for local development and testing.

sam local start-api
# For example, you can invoke the hello world API like this:
curl http://127.0.0.1:3000/hello

CI/CD pipelines

Last but not least, SAM provides templates for CI/CD for a dual account setup (i.e. development & production). This CI/CD template would start on every commit to Git, run tests and deploy your application to both accounts. To initialize the pipeline, run the following command:

sam pipeline init --bootstrap

However, out of the box it would fail on the build phase unless you remove the line resolve_s3 = true from the samconfig.toml file:

[default.package.parameters]
resolve_s3 = true

If you are new to SAM, I recommend looking at this SAM workshop from AWS.

Enabling testing on the pipeline

It’s important to note that the CI/CD template generated by SAM does not have any testing enabled by default. You must do the following:

  • In the file Codepipeline.yaml: Uncomment all the sections related to unit tests. The file has a few lines with the text “# Uncomment and modify the following step for running the unit-tests”.
  • In the same file Codepipeline.yaml: This might be only specific to my TypeScript setup. However, to make it work, I had to replace the existing Image (Image: aws/codebuild/amazonlinux2-x86_64-standard:3.0) with a later version (Image: aws/codebuild/amazonlinux2-x86_64-standard:5.0). The pipeline will work without this change.
  • In the file buildspec_unit_test.yml: I had to add the following code to run the tests for my TypeScript code:
version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 18
  pre_build:
    commands:
      - cd iam-detective/
      - npm install
  build:
    commands:
      # trigger the unit tests here
      - echo 'Running unit tests'
      - npm run test

The architecture

The primary responsibility of the backend is to read data from AWS and push it to Airtable. Concretely, these are the main tasks:

  1. List all the IAM roles from the current account running our lambda.
  2. For all the roles, create an Access Advisor report and push only the unused permissions (at Grabyo, we consider a permission “unused” after three months) to Airtable for manual review. We don’t send used permissions as they don’t need reviewing.

Because this logic only needs to run once daily, AWS Lambda was the best option. Subsequently, AWS SAM was the obvious choice.

However, because we have hundreds of roles in our AWS accounts, the whole process can’t run in a single lambda (the max execution time for a lambda at the time of writing is 15 minutes).

To solve this challenge, we decided to go for the following architecture:

The main components of this architecture are:

  • EventBridge Schedule: Executes the IAM inspector Lambda daily.
  • IAM Inspector Lambda: Assumes the cross-account-access role to list all the roles from all the different development or production accounts. For each of these roles, it sends a new job to an SQS queue.
  • Role Processor Lambda: For every job, it creates an Access Advisor report and sends the unused permissions results to Airtable.

The IAM inspector lambda

These are the general steps implemented on this lambda:

  • List all the roles from all the AWS accounts: The lambda is responsible for listing IAM roles from multiple accounts. We need a cross-account IAM role to allow it to access all the accounts.
  • Get all the existing roles in Airtable: We query this data so that we don’t resubmit roles whose permissions are already recorded in Airtable.
  • Push all non-existing roles to an SQS queue: We push every role that isn’t in Airtable yet to an SQS queue for later processing.
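
The three steps above can be sketched as a pair of pure functions that decide which roles still need a job. This is a minimal sketch and not the service's actual code: the RoleJob shape, the function names, and the idea that Airtable records are keyed by role ARN are all assumptions made for illustration.

```typescript
// Hypothetical job shape; the real message format is not shown in this post.
interface RoleJob {
  accountId: string;
  roleArn: string;
}

// Returns one job per role that is not already tracked in Airtable.
// `knownRoleArns` stands in for whatever the Airtable query returns.
function rolesToEnqueue(allRoles: RoleJob[], knownRoleArns: string[]): RoleJob[] {
  const known = new Set<string>(knownRoleArns);
  return allRoles.filter((role) => !known.has(role.roleArn));
}

// Serialises each job into an SQS message body (one message per role).
function toMessageBodies(jobs: RoleJob[]): string[] {
  return jobs.map((job) => JSON.stringify(job));
}
```

Keeping the filtering separate from the AWS and Airtable calls makes this part easy to unit test in the pipeline without any network access.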

To kickstart the project, I built the service on top of one of the SAM templates. Concretely, I went for the Hello World Example (a simple API Gateway endpoint linked to a lambda that responds with a Hello World message) on Node.js 18 with TypeScript, packaged as a ZIP.

I selected this stack for several reasons. TypeScript is a language I’m well-versed in, and it offers notably quick build and Lambda cold start times. Just for context, building this application in Node.js takes roughly 10 seconds, while its Java equivalent requires about 45 seconds. Furthermore, if you opt for an Image package deployment (i.e., Docker running on Lambda), these times increase significantly.

After renaming the hello-world application, the generated template.yaml looks like this:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  iam-detective

  Sample SAM Template for iam-detective

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 60

Resources:
  InvestigateIAMPermissionsFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      Tags:
        Service: iam-detective
      CodeUri: iam-detective
      Handler: InvestigateIAMPermissions.listAllIAMRoles
      Runtime: nodejs18.x
      Architectures:
        - x86_64
      Events:
        IAMDetectiveInvestigateIAMPermissionsAPI:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /iam-detective
            Method: get
    Metadata: # Manage esbuild properties
      BuildMethod: esbuild
      BuildProperties:
        Minify: false
        Target: "es2020"
        Sourcemap: true
        EntryPoints:
          - InvestigateIAMPermissions.ts

Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
  InvestigateIAMPermissionsFunctionApi:
    Description: "API Gateway endpoint URL for IAM Detective"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/iam-detective/"
  InvestigateIAMPermissionsFunction:
    Description: "Investigate IAM Permissions Lambda Function ARN"
    Value: !GetAtt InvestigateIAMPermissionsFunction.Arn
  InvestigateIAMPermissionsFunctionIamRole:
    Description: "Implicit IAM Role created for Investigate IAM Permissions function"
    Value: !GetAtt InvestigateIAMPermissionsFunctionRole.Arn

The previous template contains a lambda with an API Gateway event. This event should be removed, as it has no authentication attached. The template also still misses a few critical components that the lambda needs to perform its tasks.

I recommend creating the CI/CD pipeline at this stage as it will allow you to regularly merge your changes to the infrastructure and code into Git and deploy the changes automatically in the cloud. I like to do this regularly after I add a new feature to ensure the system works in the cloud and not only locally.

SQS queue

The lambda must list all IAM roles and send them to a queue. Defining this queue together with a dead-letter queue on the template can be achieved like this:

  # This is an SQS queue with all default configuration properties. To learn more about the available options, see
  # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sqs-queues.html
  InvestigateIAMPermissionsQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 901 # Max lambda timeout + 1 second
      RedrivePolicy:
        deadLetterTargetArn:
          Fn::GetAtt: [InvestigateIAMPermissionsDeadLetterQueue, Arn]
        maxReceiveCount: 3

  InvestigateIAMPermissionsDeadLetterQueue:
    Type: AWS::SQS::Queue
...
Outputs:
...
  InvestigateIAMPermissionsQueue:
    Description: "The SQS queue to communicate the two lambda functions"
    Value: !Ref InvestigateIAMPermissionsQueue
  InvestigateIAMPermissionsDeadLetterQueue:
    Description: "The dead letter SQS queue to send messages that can't be processed"
    Value: !Ref InvestigateIAMPermissionsDeadLetterQueue

We can use environment variables to pass the SQS URL to the lambda. To do this, we need to include the following code inside the Lambda properties definition inside the template:

    Properties:
      Environment:
        Variables:
          SQS_URL_TO_PROCESS_IAM_ROLES: !Ref InvestigateIAMPermissionsQueue

This is the TypeScript code required to pull that URL from the environment variables:

const sqsURL = process.env.SQS_URL_TO_PROCESS_IAM_ROLES ?? "ERROR";
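
With the "ERROR" fallback, a missing variable only surfaces later, when the call to SQS fails with an invalid URL. One alternative is to fail fast when the variable is absent. The helper below is a sketch of that idea, not what the service does; the requireEnv name is hypothetical, and it takes the environment as a parameter so it can be tested without touching process.env.

```typescript
// Reads a required setting from an environment-like record and
// throws immediately when it is missing or empty.
function requireEnv(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In the lambda this would be called as:
//   const sqsURL = requireEnv("SQS_URL_TO_PROCESS_IAM_ROLES", process.env);
```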

Permissions 

For the lambda function to perform its tasks, it requires the following permissions:

Cross-account access role:

At Grabyo, we have different AWS accounts. For this reason, the lambda needs to list all roles in the different AWS accounts. We have decided to create a cross-account access role in all the accounts and allow the lambda to assume all these roles. This is the CloudFormation for this cross-account IAM role:

AWSTemplateFormatVersion: "2010-09-09"
Resources:
  IAMDetectiveCrossAccountAccessRole:
    Type: "AWS::IAM::Role"
    Properties:
      RoleName: "cross-account-access"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              AWS:
                - "arn:aws:iam::XXXXXXXXXXXX:root"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: "IAMDetectiveCrossAccountAccessRolePermissions"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "iam:GenerateServiceLastAccessedDetails"
                  - "iam:ListRoles"
                  - "iam:GetServiceLastAccessedDetails"
                Resource: "*"

Outputs:
  IAMDetectiveCrossAccountAccessRoleArn:
    Value: !GetAtt IAMDetectiveCrossAccountAccessRole.Arn

Using the aws cli, you can create this IAM role in all the accounts needed using this command:

aws cloudformation create-stack \
  --stack-name detective-cross-account-access-role \
  --template-body file://detective-cross-account-access.yaml \
  --capabilities CAPABILITY_NAMED_IAM

Airtable access:

Before it submits new roles for processing, this lambda checks the information already stored in Airtable so that it can skip roles that don’t need resubmitting. To access Airtable, the lambda needs a personal access token. We store this token in Secrets Manager using the following CloudFormation template:

Parameters:
  AirtableSecret:
    Type: String
    NoEcho: true

Resources:
  IamDetectiveAirtableSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: airtable-token
      Description: Secret for IAM Detective Airtable integration
      # The substituted name must match the AirtableSecret parameter declared above
      SecretString: !Sub '{"airtable_token":"${AirtableSecret}"}'

Outputs:
  IAMDetectiveAirtableTokenARN:
    Description: "The airtable secret for the IAM detective microservice."
    Value: !Ref IamDetectiveAirtableSecret
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-ARN"

Using the aws cli, you can create this secrets manager stack and provide the secret at the time of creation by using this command:

# Use update-stack instead of create-stack when the stack already exists
aws cloudformation create-stack \
  --stack-name airtable-secret \
  --template-body file://secrets-manager-airtable.yaml \
  --parameters ParameterKey=AirtableSecret,ParameterValue=XXXXXX \
  --capabilities CAPABILITY_NAMED_IAM

Opsgenie access:

We use Opsgenie heartbeats to check the health of the backend service. Again, this requires an API token that we store in Secrets Manager. Here is the CloudFormation template:

Parameters:
  OpsgenieSecret:
    Type: String
    NoEcho: true

Resources:
  IamDetectiveOpsgenieSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: opsgenie-token
      Description: Secret for IAM Detective Opsgenie integration
      # The substituted name must match the OpsgenieSecret parameter declared above
      SecretString: !Sub '{"opsgenie_token":"${OpsgenieSecret}"}'

Outputs:
  IAMDetectiveOpsgenieTokenARN:
    Description: "The opsgenie secret for the IAM detective microservice."
    Value: !Ref IamDetectiveOpsgenieSecret
    Export:
      Name:
        Fn::Sub: "${AWS::StackName}-ARN"

Lambda IAM permissions:

We need to grant access to the lambda to perform all of these actions by adding the following to the lambda policy on the SAM template:

Policies:
  - Statement:
      - Sid: Stmt1679505932243
        Effect: Allow
        Action:
          - sts:AssumeRole
        Resource:
          [
            arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
            arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
            ...
          ]
      - Sid: Stmt1679384123196
        Effect: Allow
        Action:
          - sqs:SendMessage
        Resource: !GetAtt InvestigateIAMPermissionsQueue.Arn
      - Sid: Stmt1679253132905
        Effect: Allow
        Action:
          - secretsmanager:GetSecretValue
        Resource:
          - Fn::ImportValue:
              Fn::Sub: "${AirtableSecretsStackName}-ARN"
          - Fn::ImportValue:
              Fn::Sub: "${OpsgenieSecretsStackName}-ARN"

This policy allows the lambda to assume the cross-account role, send messages to the SQS queue, and read both the Airtable and Opsgenie secrets.

Cronjob

Finally, the service needs a daily cronjob to kickstart the lambda. To define this cronjob in the SAM template, you must add the following code inside the events of the lambda function (next to the API gateway event).

Events:
  ...
  IAMDetectiveInvestigateIAMPermissionsCron:
    Type: Schedule
    Properties:
      Schedule: "cron(0 1 * * ? *)"
      Description: This is the cron job for the IAM Detective - Investigate IAM permissions daily.
      Enabled: True

Please note that multiple events firing the lambda means it can be called from the API gateway or the cronjob independently.
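
If the handler ever needs to behave differently depending on which event fired it, it can inspect the incoming payload. The following is a rough sketch under stated assumptions: the detectInvocationSource name and the InvocationSource type are hypothetical, and the check relies on scheduled events carrying source "aws.events" and API Gateway REST proxy events carrying an httpMethod field.

```typescript
type InvocationSource = "schedule" | "api" | "unknown";

// Classifies a raw Lambda event by looking at well-known top-level fields.
function detectInvocationSource(event: unknown): InvocationSource {
  if (typeof event !== "object" || event === null) {
    return "unknown";
  }
  const record = event as Record<string, unknown>;
  // Scheduled (cron) events are delivered with source "aws.events".
  if (record["source"] === "aws.events") {
    return "schedule";
  }
  // API Gateway REST proxy events include the HTTP method of the request.
  if (typeof record["httpMethod"] === "string") {
    return "api";
  }
  return "unknown";
}
```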

The role processor lambda

Taking into account that this lambda will be run once per role in the SQS queue, these are the steps implemented on this lambda:

  • Generate a ServiceLastAccessedReport: To find unused permissions, we first need to generate a service last accessed report and then download the generated report.
  • Update Airtable: With the report, we need to iterate over all the permissions in the role, identify unused ones (i.e., not used in the last three months), and send them to Airtable.
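
The "identify unused ones" step can be sketched as a pure function over the report entries. This is an illustrative sketch only: the ServiceLastAccessed shape is a simplified, hypothetical stand-in for the entries in the downloaded report, and the function names are not taken from the service's code. Month arithmetic uses the calendar month, so a cutoff computed from the last days of a month can overflow into the next one.

```typescript
// Simplified, hypothetical view of one entry in a service last accessed report.
interface ServiceLastAccessed {
  serviceNamespace: string;
  // Undefined when the role never used the service in the tracking period.
  lastAuthenticated?: Date;
}

// Returns a new date that is `months` calendar months before `from` (UTC).
function subtractMonths(from: Date, months: number): Date {
  const result = new Date(from.getTime());
  result.setUTCMonth(result.getUTCMonth() - months);
  return result;
}

// Lists the services that were never used or were last used before the cutoff.
function findUnusedServices(
  services: ServiceLastAccessed[],
  now: Date,
  unusedAfterMonths: number = 3
): string[] {
  const cutoff = subtractMonths(now, unusedAfterMonths).getTime();
  return services
    .filter((s) => s.lastAuthenticated === undefined || s.lastAuthenticated.getTime() < cutoff)
    .map((s) => s.serviceNamespace);
}
```

Passing the current date in as a parameter, rather than calling new Date() inside the function, keeps the three-month rule deterministic in unit tests.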

This is the code required to include the lambda on the template:

# This is the Lambda function definition associated with the source code: sqs-payload-logger.js. For all available properties, see
# https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
SQSIAMProcessor:
  Type: AWS::Serverless::Function
  Properties:
    Tags:
      Service: iam-detective
    CodeUri: iam-detective
    Handler: SQSIAMProcessor.sqsIAMProcessorHandler
    Runtime: nodejs18.x
    Architectures:
      - x86_64
    Description: A Lambda function that logs the payload of messages sent to an associated SQS queue.

    # This property associates this Lambda function with the SQS queue defined above, so that whenever the queue
    # receives a message, the Lambda function is invoked
    Events:
      SQSQueueEvent:
        Type: SQS
        Properties:
          Queue: !GetAtt InvestigateIAMPermissionsQueue.Arn
          BatchSize: 10
          Enabled: true
          ScalingConfig:
            MaximumConcurrency: 2
    MemorySize: 128
    Timeout: 900
    Policies:
      - Statement:
          - Sid: Stmt1679505932243
            Effect: Allow
            Action:
              - sts:AssumeRole
            Resource:
              [
                arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
                ...,
              ]
          - Sid: Stmt1679505932905
            Effect: Allow
            Action:
              - secretsmanager:GetSecretValue
            Resource:
              - Fn::ImportValue:
                  Fn::Sub: "${AirtableSecretsStackName}-ARN"
  Metadata: # Manage esbuild properties
    BuildMethod: esbuild
    BuildProperties:
      Minify: false
      Target: "es2020"
      Sourcemap: true
      EntryPoints:
        - SQSIAMProcessor.ts

Invoking event:

Unlike the previous lambda, which is invoked by the cronjob and API events, the only event invoking this lambda is the SQS queue we defined earlier.

It’s worth pointing out that, for this lambda, we need to limit the number of concurrent executions to the minimum (2 at the time of writing). This is required because the IAM Inspector lambda will generate hundreds of jobs, and we can’t process them all in parallel because Airtable would rate-limit our API requests. This is achieved by adding the field MaximumConcurrency: 2 in the previous template.

Lambda IAM permissions:

Regarding permissions, the lambda needs to assume the same cross-account access role to generate the IAM reports, and it needs access to the same Airtable secret to push the newly found unused permissions.

The final template

This is the final template, including all infrastructure:

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: >
  iam-detective

  Sample SAM Template for iam-detective

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 60

Parameters:
  AirtableSecretsStackName:
    Description: Name of the airtable secrets manager stack.
    Type: String
    Default: "airtable-token"
  OpsgenieSecretsStackName:
    Description: Name of the opsgenie secrets manager stack.
    Type: String
    Default: "opsgenie-token"

Conditions:
  IsDevAccount: !Equals [!Ref AWS::AccountId, "630843564847"]

Resources:
  InvestigateIAMPermissionsFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      Tags:
        Service: iam-detective
      CodeUri: iam-detective
      Handler: InvestigateIAMPermissions.listAllIAMRoles
      Runtime: nodejs18.x
      Environment:
        Variables:
          SQS_URL_TO_PROCESS_IAM_ROLES: !Ref InvestigateIAMPermissionsQueue
      Architectures:
        - x86_64
      Policies:
        - Statement:
            - Sid: Stmt1679505932243
              Effect: Allow
              Action:
                - sts:AssumeRole
              Resource:
                [
                  arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
                  ...
                ]
            - Sid: Stmt1679384123196
              Effect: Allow
              Action:
                - sqs:SendMessage
              Resource: !GetAtt InvestigateIAMPermissionsQueue.Arn
            - Sid: Stmt1679253132905
              Effect: Allow
              Action:
                - secretsmanager:GetSecretValue
              Resource:
                - Fn::ImportValue:
                    Fn::Sub: "${AirtableSecretsStackName}-ARN"
                - Fn::ImportValue:
                    Fn::Sub: "${OpsgenieSecretsStackName}-ARN"
      Events:
        IAMDetectiveInvestigateIAMPermissionsCron:
          Type: Schedule
          Properties:
            Schedule: "cron(0 1 * * ? *)"
            Description: This is the cron job for the IAM Detective - Investigate IAM permissions daily.
            Enabled: True
    Metadata: # Manage esbuild properties
      BuildMethod: esbuild
      BuildProperties:
        Minify: false
        Target: "es2020"
        Sourcemap: true
        EntryPoints:
          - InvestigateIAMPermissions.ts

  # This is an SQS queue with all default configuration properties. To learn more about the available options, see
  # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sqs-queues.html
  InvestigateIAMPermissionsQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 901 # Max lambda timeout + 1 second
      RedrivePolicy:
        deadLetterTargetArn:
          Fn::GetAtt: [InvestigateIAMPermissionsDeadLetterQueue, Arn]
        maxReceiveCount: 3

  InvestigateIAMPermissionsDeadLetterQueue:
    Type: AWS::SQS::Queue

  # This is the Lambda function definition associated with the source code: sqs-payload-logger.js. For all available properties, see
  # https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
  SQSIAMProcessor:
    Type: AWS::Serverless::Function
    Properties:
      Tags:
        Service: iam-detective
      CodeUri: iam-detective
      Handler: SQSIAMProcessor.sqsIAMProcessorHandler
      Runtime: nodejs18.x
      Architectures:
        - x86_64
      Description: A Lambda function that logs the payload of messages sent to an associated SQS queue.

      # This property associates this Lambda function with the SQS queue defined above, so that whenever the queue
      # receives a message, the Lambda function is invoked
      Events:
        SQSQueueEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt InvestigateIAMPermissionsQueue.Arn
            BatchSize: 10
            Enabled: true
            ScalingConfig:
              MaximumConcurrency: 2
      MemorySize: 128
      Timeout: 900
      Policies:
        - Statement:
            - Sid: Stmt1679505932243
              Effect: Allow
              Action:
                - sts:AssumeRole
              Resource:
                [
                  arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
                  ...
                ]
            - Sid: Stmt1679505932905
              Effect: Allow
              Action:
                - secretsmanager:GetSecretValue
              Resource:
                - Fn::ImportValue:
                    Fn::Sub: "${AirtableSecretsStackName}-ARN"
    Metadata: # Manage esbuild properties
      BuildMethod: esbuild
      BuildProperties:
        Minify: false
        Target: "es2020"
        Sourcemap: true
        EntryPoints:
          - SQSIAMProcessor.ts

Outputs:
  InvestigateIAMPermissionsFunction:
    Description: "Investigate IAM Permissions Lambda Function ARN"
    Value: !GetAtt InvestigateIAMPermissionsFunction.Arn
  InvestigateIAMPermissionsFunctionIamRole:
    Description: "Implicit IAM Role created for Investigate IAM Permissions function"
    Value: !GetAtt InvestigateIAMPermissionsFunctionRole.Arn
  InvestigateIAMPermissionsQueue:
    Description: "The SQS queue to communicate the two lambda functions"
    Value: !Ref InvestigateIAMPermissionsQueue
  InvestigateIAMPermissionsDeadLetterQueue:
    Description: "The dead letter SQS queue to send messages that can't be processed"
    Value: !Ref InvestigateIAMPermissionsDeadLetterQueue

Final thoughts

In this journey of creating a robust microservice, we’ve uncovered the remarkable capabilities of AWS SAM. The ability to deploy, test, and maintain cloud-native applications at lightning speed is a game-changer. The power of AWS SAM, combined with your expertise, opens the door to endless possibilities for innovation and efficiency in your development process.

As you embark on your own AWS SAM adventures, remember that the cloud is your playground, and AWS SAM is your ultimate tool. Keep experimenting, keep building, and enjoy the speed and agility that AWS SAM brings to your development projects. Happy coding!
