Building a Microservice Lightning-Fast with AWS SAM

This blog post is a follow-up to my previous blog post, Enforcing IAM least privilege with AWS SAM + Airtable. I will shift the focus of this blog post from the actual microservice, which is very simple, to the great benefits of using AWS SAM to build it.

I don’t write code regularly anymore. However, I built this service at a production-ready level using AWS SAM in a few days without much effort. Leaving the code aside, I’ll try to walk you through the steps I had to take to define the required supporting infrastructure using SAM and hopefully highlight how easy it was.

Why do I ❤️ SAM?

Before diving deep into the microservice, let’s cover some key features that make me like SAM.

Cloud deployments and sandbox environments

SAM allows you to deploy your full microservice in the cloud in seconds. During my tests with the HelloWorld example, SAM took about 10 seconds to deploy the entire stack (API GW + Lambda) into an AWS account. However, more importantly, you can create dedicated sandbox environments by naming your stack differently:

sam deploy --profile sam-dev --stack-name bob-iam-detective

This approach allows multiple developers to create a dedicated stack for their development and testing without interfering with other developers/stages.

Live sync deployments

SAM relies on CloudFormation to deploy your stack. This translates to noticeable waiting times (around 10 seconds in my tests) when making micro changes. To solve this problem, AWS provides live sync deployments through the following command:

sam sync --profile sam-dev --stack-name bob-iam-detective

The previous command syncs your application with your cloud deployment using direct AWS APIs instead of CloudFormation. This translates into near real-time updates to your stack when making changes to your code.

However, note that SAM only supports this feature on AWS Lambda, Amazon API Gateway, and AWS Step Functions APIs.

Localhost development made easy

The following SAM command launches your stack in a local Docker container for testing. If you pair this environment with LocalStack, you get an excellent environment for local development and testing.

sam local start-api
# For example: you can invoke the hello world API like this
curl http://127.0.0.1:3000/hello

CI/CD pipelines

Last but not least, SAM provides templates for CI/CD for a dual account setup (i.e. development & production). This CI/CD template would start on every commit to Git, run tests and deploy your application to both accounts. To initialize the pipeline, run the following command:

sam pipeline init --bootstrap

However, out of the box it would fail on the build phase unless you remove the line resolve_s3 = true from the samconfig.toml file:

[default.package.parameters]
resolve_s3 = true

If you are new to SAM, I recommend looking at this SAM workshop from AWS.

Enabling testing on the pipeline

It’s important to note that the CI/CD template generated by SAM does not have any testing enabled by default. You must do the following:

  • In the file Codepipeline.yaml: Uncomment all the sections related to unit tests. The file has a few lines with the text “# Uncomment and modify the following step for running the unit-tests”.
  • In the same file Codepipeline.yaml: This might be specific to my TypeScript setup. However, to make it work, I had to replace the existing Image (Image: aws/codebuild/amazonlinux2-x86_64-standard:3.0) with a later version (Image: aws/codebuild/amazonlinux2-x86_64-standard:5.0). In my case, the pipeline did not work without this change.
  • In the file buildspec_unit_test.yml: I had to add the following code to run the tests for my TypeScript code:
version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 18
  pre_build:
    commands:
      - cd iam-detective/
      - npm install
  build:
    commands:
      # trigger the unit tests here
      - echo 'Running unit tests'
      - npm run test

The architecture

The primary responsibility of the backend is to read data from AWS and push it to Airtable. Concretely, these are the main tasks:

  1. List all the IAM roles from the current account running our lambda.
  2. For all the roles, create an Access Advisor report and push only the unused permissions (at Grabyo, we consider a permission “unused” after three months) to Airtable for manual review. We don’t send used permissions as they don’t need reviewing.

Because this logic only needs to run once daily, AWS Lambda was the best option, and AWS SAM was the obvious choice for building it.

However, because we have hundreds of roles in our AWS accounts, the whole process can’t run in a single lambda (the max execution time for a lambda at the time of writing is 15 minutes).

To solve this challenge, we decided to go for the following architecture:

The main components of this architecture are:

  • EventBridge Schedule: Executes the IAM inspector Lambda daily.
  • IAM Inspector Lambda: Assumes the role cross-account-access to list all the roles from all the different development or production accounts. For each of these roles, it sends a new job to an SQS queue.
  • Role Processor Lambda: For every job, it creates an Access Advisor report and sends the unused permissions results to Airtable.

The IAM inspector lambda

These are the general steps implemented on this lambda:

  • List all the roles from all the AWS accounts: The lambda is responsible for listing IAM roles from multiple accounts. We need a cross-account IAM role to allow it to access all the accounts.
  • Get all the existing roles in Airtable: We query this data to avoid sending Airtable anything it already holds.
  • Push all non-existing roles to an SQS queue: We push every role that is not yet in Airtable to an SQS queue for later processing.
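
The filtering and queueing steps above can be sketched as a pair of pure helpers. This is a minimal sketch, not the service's actual code: the RoleRef shape, the roleKey format, and both function names are assumptions made for illustration. The group size of 10 used in the usage note reflects the SQS SendMessageBatch limit of ten entries per call, not anything stated about this service.

```typescript
// Hypothetical shape of a role discovered in one of the accounts.
interface RoleRef {
  accountId: string;
  roleName: string;
}

// Builds a key so that roles with the same name in different accounts stay distinct.
function roleKey(role: RoleRef): string {
  return `${role.accountId}/${role.roleName}`;
}

// Returns only the discovered roles that are not already tracked in Airtable.
function selectNewRoles(discovered: RoleRef[], alreadyInAirtable: RoleRef[]): RoleRef[] {
  const known = new Set(alreadyInAirtable.map(roleKey));
  return discovered.filter((role) => !known.has(roleKey(role)));
}

// Splits a list into groups of at most `size` items, preserving order.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

Calling chunk(selectNewRoles(discovered, existing), 10) would then yield groups that each fit into a single batch send.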

To kickstart the project, I built the service on top of one of the SAM templates. Concretely, I chose the Hello World Example (a simple API gateway endpoint linked to a lambda that responds with a Hello World message) on Node.js 18 with TypeScript, packaged as a ZIP.

I selected this stack for several reasons. TypeScript is a language I’m well-versed in, and it offers notably quick build and Lambda cold start times. Just for context, building this application in Node.js takes roughly 10 seconds, while its Java equivalent requires about 45 seconds. Furthermore, if you opt for an Image package deployment (i.e., Docker running on Lambda), these times increase significantly.

After renaming the hello-world application, the template.yaml looks like this:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  iam-detective

  Sample SAM Template for iam-detective

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 60

Resources:
  InvestigateIAMPermissionsFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      Tags:
        Service: iam-detective
      CodeUri: iam-detective
      Handler: InvestigateIAMPermissions.listAllIAMRoles
      Runtime: nodejs18.x
      Architectures:
        - x86_64
      Events:
        IAMDetectiveInvestigateIAMPermissionsAPI:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /iam-detective
            Method: get
    Metadata: # Manage esbuild properties
      BuildMethod: esbuild
      BuildProperties:
        Minify: false
        Target: "es2020"
        Sourcemap: true
        EntryPoints: 
        - InvestigateIAMPermissions.ts

Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
  InvestigateIAMPermissionsFunctionApi:
    Description: "API Gateway endpoint URL for IAM Detective"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/iam-detective/"
  InvestigateIAMPermissionsFunction:
    Description: "Investigate IAM Permissions Lambda Function ARN"
    Value: !GetAtt InvestigateIAMPermissionsFunction.Arn
  InvestigateIAMPermissionsFunctionIamRole:
    Description: "Implicit IAM Role created for Investigate IAM Permissions function"
    Value: !GetAtt InvestigateIAMPermissionsFunctionRole.Arn

The previous template contains a lambda with an API gateway event. That event should be removed because it has no authentication attached. The template also still lacks a few critical components that the lambda needs to perform its tasks.

I recommend creating the CI/CD pipeline at this stage as it will allow you to regularly merge your changes to the infrastructure and code into Git and deploy the changes automatically in the cloud. I like to do this regularly after I add a new feature to ensure the system works in the cloud and not only locally.

SQS queue

The lambda must list all IAM roles and send them to a queue. Defining this queue together with a dead-letter queue on the template can be achieved like this:

  # This is an SQS queue with all default configuration properties. To learn more about the available options, see
  # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sqs-queues.html
  InvestigateIAMPermissionsQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 901 # Max lambda timeout + 1 second
      RedrivePolicy:
        deadLetterTargetArn:
          Fn::GetAtt: [InvestigateIAMPermissionsDeadLetterQueue, Arn]
        maxReceiveCount: 3

  InvestigateIAMPermissionsDeadLetterQueue:
    Type: AWS::SQS::Queue
...
Outputs:
...
  InvestigateIAMPermissionsQueue:
    Description: "The SQS queue to communicate the two lambda functions"
    Value: !Ref InvestigateIAMPermissionsQueue
  InvestigateIAMPermissionsDeadLetterQueue:
    Description: "The dead letter SQS queue to send messages that can't be processed"
    Value: !Ref InvestigateIAMPermissionsDeadLetterQueue

We can use environment variables to pass the SQS URL to the lambda. To do this, we need to include the following code inside the Lambda properties definition inside the template:

    Properties:
      Environment:
        Variables:
          SQS_URL_TO_PROCESS_IAM_ROLES: !Ref InvestigateIAMPermissionsQueue

This is the TypeScript code required to pull that URL from the environment variables:

const sqsURL = process.env.SQS_URL_TO_PROCESS_IAM_ROLES ?? "ERROR";
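
With the "ERROR" fallback, a missing variable would only surface later, when the first message is sent to an invalid URL. One possible alternative is to fail fast at startup. The helper below is a sketch under that assumption: requireEnv and the Env type are hypothetical names, and the environment map is passed in explicitly so the function can be tested without a real Lambda environment.

```typescript
// Minimal view of an environment map; inside a Lambda this would be process.env.
type Env = Record<string, string | undefined>;

// Returns the value of a required variable, or throws a descriptive error
// when the variable is missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

In the lambda, this would be called as requireEnv(process.env, "SQS_URL_TO_PROCESS_IAM_ROLES").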

Permissions 

For the lambda function to perform its tasks, it requires the following permissions:

Cross-account access role:

At Grabyo, we have different AWS accounts. For this reason, the lambda needs to list all roles in the different AWS accounts. We have decided to create a cross-account access role in all the accounts and allow the lambda to assume all these roles. This is the CloudFormation for this cross-account IAM role:

AWSTemplateFormatVersion: "2010-09-09"
Resources:
  IAMDetectiveCrossAccountAccessRole:
    Type: "AWS::IAM::Role"
    Properties:
      RoleName: "cross-account-access"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              AWS:
                - "arn:aws:iam::XXXXXXXXXXXX:root"
            Action: "sts:AssumeRole"
      Policies:
        - PolicyName: "IAMDetectiveCrossAccountAccessRolePermissions"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "iam:GenerateServiceLastAccessedDetails"
                  - "iam:ListRoles"
                  - "iam:GetServiceLastAccessedDetails"
                Resource: "*"

Outputs:
  IAMDetectiveCrossAccountAccessRoleArn:
    Value: !GetAtt IAMDetectiveCrossAccountAccessRole.Arn

Using the aws cli, you can create this IAM role in all the accounts needed using this command:

aws cloudformation create-stack \
  --stack-name detective-cross-account-access-role \
  --template-body file://detective-cross-account-access.yaml \
  --capabilities CAPABILITY_NAMED_IAM

Airtable access:

This lambda uses the information stored in Airtable for performance optimization reasons before it submits new roles for processing. To access Airtable, the lambda needs to use a Personal access token. We are storing this token in SecretsManager using the following CloudFormation template:

Parameters:
  AirtableSecret:
    Type: String
    NoEcho: true

Resources:
  IamDetectiveAirtableSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: airtable-token
      Description: Secret for IAM Detective Airtable integration
      SecretString: !Sub '{"airtable_token":"${AirtableSecret}"}'

Outputs:
  IAMDetectiveAirtableTokenARN:
    Description: "The airtable secret for the IAM detective microservice."
    Value: !Ref IamDetectiveAirtableSecret
    Export:
      Name: 
        Fn::Sub: "${AWS::StackName}-ARN"

Using the aws cli, you can create this secrets manager stack and provide the secret at the time of creation by using this command:

# Use update-stack instead of create-stack to change an existing stack
aws cloudformation create-stack \
  --stack-name airtable-secret \
  --template-body file://secrets-manager-airtable.yaml \
  --parameters ParameterKey=AirtableSecret,ParameterValue=XXXXXX \
  --capabilities CAPABILITY_NAMED_IAM
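
The template above stores the token as a JSON object with a single airtable_token key, so the lambda has to unwrap it after reading the secret. The following sketch shows one way to do that defensively. The function name parseAirtableToken is hypothetical, and the call that fetches the secret string from Secrets Manager is deliberately left out.

```typescript
// Extracts the Airtable token from the JSON secret string written by the template.
// Throws if the payload is not a JSON object or the key is missing or empty.
function parseAirtableToken(secretString: string): string {
  const parsed: unknown = JSON.parse(secretString);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Secret payload is not a JSON object");
  }
  const token = (parsed as Record<string, unknown>)["airtable_token"];
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("Secret payload has no airtable_token value");
  }
  return token;
}
```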

Opsgenie access:

We use Opsgenie heartbeats in the backend service to check the health of the service. Again, this requires an API token that we store in Secrets Manager. Here is the CloudFormation template:

Parameters:
  OpsgenieSecret:
    Type: String
    NoEcho: true

Resources:
  IamDetectiveOpsgenieSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: opsgenie-token
      Description: Secret for IAM Detective Opsgenie integration
      SecretString: !Sub '{"opsgenie_token":"${OpsgenieSecret}"}'

Outputs:
  IAMDetectiveOpsgenieTokenARN:
    Description: "The opsgenie secret for the IAM detective microservice."
    Value: !Ref IamDetectiveOpsgenieSecret
    Export:
      Name: 
        Fn::Sub: "${AWS::StackName}-ARN"

Lambda IAM permissions:

We need to grant access to the lambda to perform all of these actions by adding the following to the lambda policy on the SAM template:

Policies:
  - Statement:
      - Sid: Stmt1679505932243
        Effect: Allow
        Action:
          - sts:AssumeRole
        Resource:
          [
            arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
            arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
            ...
          ]
      - Sid: Stmt1679384123196
        Effect: Allow
        Action:
          - sqs:SendMessage
        Resource: !GetAtt InvestigateIAMPermissionsQueue.Arn
      - Sid: Stmt1679253132905
        Effect: Allow
        Action:
          - secretsmanager:GetSecretValue
        Resource:
          - Fn::ImportValue:
              Fn::Sub: "${AirtableSecretsStackName}-ARN"
          - Fn::ImportValue:
              Fn::Sub: "${OpsgenieSecretsStackName}-ARN"

This policy grants the lambda permission to assume the cross-account role, send messages to the SQS queue, and read both the Airtable and Opsgenie secrets.

Cronjob

Finally, the service needs a daily cronjob to kickstart the lambda. To define this cronjob in the SAM template, you must add the following code inside the events of the lambda function (next to the API gateway event).

Events:
...
  IAMDetectiveInvestigateIAMPermissionsCron:
    Type: Schedule
    Properties:
      Schedule: "cron(0 1 * * ? *)"
      Description: This is the cron job for the IAM Detective - Investigate IAM permissions daily.
      Enabled: True

Please note that multiple events firing the lambda means it can be called from the API gateway or the cronjob independently.
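
If the handler ever needs to tell the two triggers apart, the event payload itself is enough. The sketch below is an illustration only: detectTrigger and the Trigger type are hypothetical names, and the checks rely on the fields that scheduled events (source and detail-type) and API Gateway proxy events (httpMethod and path) normally carry.

```typescript
type Trigger = "schedule" | "api" | "unknown";

// Narrows an unknown value to a plain key/value record.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Classifies an incoming Lambda event by looking at a few well-known fields.
function detectTrigger(event: unknown): Trigger {
  if (!isRecord(event)) {
    return "unknown";
  }
  // Scheduled rules deliver events with this source and detail-type.
  if (event["source"] === "aws.events" && event["detail-type"] === "Scheduled Event") {
    return "schedule";
  }
  // API Gateway proxy events carry the HTTP method and path.
  if (typeof event["httpMethod"] === "string" && typeof event["path"] === "string") {
    return "api";
  }
  return "unknown";
}
```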

The role processor lambda

Taking into account that this lambda runs once per role in the SQS queue, these are the steps implemented in this lambda:

  • Generate a ServiceLastAccessedReport: To find out unused permissions, we first need to generate a service last accessed report and then download the report generated.
  • Update Airtable: With the report, we need to iterate over all the permissions in the role, identify unused ones (i.e., older than three months), and send them to Airtable.
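
The "unused" check in the second step can be sketched as a small pure function. This is an illustration under stated assumptions, not the service's code: the ServiceLastAccessed interface mirrors only two fields of the entries returned by IAM's GetServiceLastAccessedDetails, findUnusedServices is a hypothetical name, and three months is approximated as 90 days.

```typescript
// Trimmed-down view of one entry in a service last accessed report.
interface ServiceLastAccessed {
  ServiceNamespace: string;
  // Absent when the role never authenticated against the service in the tracking period.
  LastAuthenticated?: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the namespaces of services that were never used or were last used
// more than `unusedAfterDays` days before `now` (90 days approximates three months).
function findUnusedServices(
  entries: ServiceLastAccessed[],
  now: Date,
  unusedAfterDays = 90,
): string[] {
  const cutoff = now.getTime() - unusedAfterDays * DAY_MS;
  return entries
    .filter(
      (entry) =>
        entry.LastAuthenticated === undefined ||
        entry.LastAuthenticated.getTime() < cutoff,
    )
    .map((entry) => entry.ServiceNamespace);
}
```

Passing the current time as a parameter keeps the function deterministic, which makes it easy to unit test in the pipeline.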

This is the code required to include the lambda on the template:

# This is the Lambda function definition associated with the source code: SQSIAMProcessor.ts. For all available properties, see
# https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
SQSIAMProcessor:
  Type: AWS::Serverless::Function
  Properties:
    Tags:
      Service: iam-detective
    CodeUri: iam-detective
    Handler: SQSIAMProcessor.sqsIAMProcessorHandler
    Runtime: nodejs18.x
    Architectures:
      - x86_64
    Description: A Lambda function that logs the payload of messages sent to an associated SQS queue.

    # This property associates this Lambda function with the SQS queue defined above, so that whenever the queue
    # receives a message, the Lambda function is invoked
    Events:
      SQSQueueEvent:
        Type: SQS
        Properties:
          Queue: !GetAtt InvestigateIAMPermissionsQueue.Arn
          BatchSize: 10
          Enabled: true
          ScalingConfig:
            MaximumConcurrency: 2
    MemorySize: 128
    Timeout: 900
    Policies:
      - Statement:
          - Sid: Stmt1679505932243
            Effect: Allow
            Action:
              - sts:AssumeRole
            Resource:
              [
                arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
                ...,
              ]
          - Sid: Stmt1679505932905
            Effect: Allow
            Action:
              - secretsmanager:GetSecretValue
            Resource:
              - Fn::ImportValue:
                  Fn::Sub: "${AirtableSecretsStackName}-ARN"
  Metadata: # Manage esbuild properties
    BuildMethod: esbuild
    BuildProperties:
      Minify: false
      Target: "es2020"
      Sourcemap: true
      EntryPoints:
        - SQSIAMProcessor.ts

Invoking event:

You can see that the only event invoking this lambda is the SQS queue we defined earlier, unlike the previous lambda, which is invoked by the cronjob and API events.

It’s worth pointing out that, for this lambda, we need to limit the number of concurrent executions to the minimum (2 at the time of writing). This is required because the IAM Inspector lambda generates hundreds of jobs for processing, and we can’t process them all in parallel because Airtable would rate-limit our API requests. This is achieved by adding the field MaximumConcurrency: 2 in the previous template.
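
With BatchSize: 10 and MaximumConcurrency: 2, each invocation can receive up to ten messages, so at most 2 × 10 = 20 roles are in flight at any time. The sketch below shows how a handler might turn one such batch into jobs. The RoleJob payload (accountId and roleName as JSON) is an assumed message format, parseRoleJobs is a hypothetical name, and the event interfaces list only the fields used here.

```typescript
// Trimmed-down view of the SQS event a lambda receives.
interface SqsRecordLike {
  messageId: string;
  body: string;
}

interface SqsEventLike {
  Records: SqsRecordLike[];
}

// Assumed job payload written by the IAM Inspector lambda.
interface RoleJob {
  accountId: string;
  roleName: string;
}

// Parses every record of a batch into a job, throwing on malformed messages
// so that they are retried and eventually moved to the dead-letter queue.
function parseRoleJobs(event: SqsEventLike): RoleJob[] {
  return event.Records.map((record) => {
    const parsed: unknown = JSON.parse(record.body);
    if (typeof parsed !== "object" || parsed === null) {
      throw new Error(`Message ${record.messageId} is not a JSON object`);
    }
    const { accountId, roleName } = parsed as Record<string, unknown>;
    if (typeof accountId !== "string" || typeof roleName !== "string") {
      throw new Error(`Message ${record.messageId} is missing accountId or roleName`);
    }
    return { accountId, roleName };
  });
}
```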

Lambda IAM permissions:

Regarding permissions, the lambda requires the same cross-account access role as it needs to generate the IAM reports and access the same Airtable secret to perform updates of the newly found permissions.

The final template

This is the final template, including all infrastructure:

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: >
  iam-detective

  Sample SAM Template for iam-detective

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 60

Parameters:
  AirtableSecretsStackName:
    Description: Name of the airtable secrets manager stack.
    Type: String
    Default: "airtable-token"
  OpsgenieSecretsStackName:
    Description: Name of the opsgenie secrets manager stack.
    Type: String
    Default: "opsgenie-token"

Conditions:
  IsDevAccount: !Equals [!Ref AWS::AccountId, "630843564847"]

Resources:
  InvestigateIAMPermissionsFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      Tags:
        Service: iam-detective
      CodeUri: iam-detective
      Handler: InvestigateIAMPermissions.listAllIAMRoles
      Runtime: nodejs18.x
      Environment:
        Variables:
          SQS_URL_TO_PROCESS_IAM_ROLES: !Ref InvestigateIAMPermissionsQueue
      Architectures:
        - x86_64
      Policies:
        - Statement:
            - Sid: Stmt1679505932243
              Effect: Allow
              Action:
                - sts:AssumeRole
              Resource:
                [
                  arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
                  ...
                ]
            - Sid: Stmt1679384123196
              Effect: Allow
              Action:
                - sqs:SendMessage
              Resource: !GetAtt InvestigateIAMPermissionsQueue.Arn
            - Sid: Stmt1679253132905
              Effect: Allow
              Action:
                - secretsmanager:GetSecretValue
              Resource:
                - Fn::ImportValue:
                    Fn::Sub: "${AirtableSecretsStackName}-ARN"
                - Fn::ImportValue:
                    Fn::Sub: "${OpsgenieSecretsStackName}-ARN"
      Events:
        IAMDetectiveInvestigateIAMPermissionsCron:
          Type: Schedule
          Properties:
            Schedule: "cron(0 1 * * ? *)"
            Description: This is the cron job for the IAM Detective - Investigate IAM permissions daily.
            Enabled: True
    Metadata: # Manage esbuild properties
      BuildMethod: esbuild
      BuildProperties:
        Minify: false
        Target: "es2020"
        Sourcemap: true
        EntryPoints: 
        - InvestigateIAMPermissions.ts

  # This is an SQS queue with all default configuration properties. To learn more about the available options, see
  # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sqs-queues.html
  InvestigateIAMPermissionsQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 901 # Max lambda timeout + 1 second
      RedrivePolicy:
        deadLetterTargetArn:
          Fn::GetAtt: [InvestigateIAMPermissionsDeadLetterQueue, Arn]
        maxReceiveCount: 3

  InvestigateIAMPermissionsDeadLetterQueue:
    Type: AWS::SQS::Queue

  # This is the Lambda function definition associated with the source code: SQSIAMProcessor.ts. For all available properties, see
  # https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
  SQSIAMProcessor:
    Type: AWS::Serverless::Function
    Properties:
      Tags:
        Service: iam-detective
      CodeUri: iam-detective
      Handler: SQSIAMProcessor.sqsIAMProcessorHandler
      Runtime: nodejs18.x
      Architectures:
        - x86_64
      Description: A Lambda function that logs the payload of messages sent to an associated SQS queue.

      # This property associates this Lambda function with the SQS queue defined above, so that whenever the queue
      # receives a message, the Lambda function is invoked
      Events:
        SQSQueueEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt InvestigateIAMPermissionsQueue.Arn
            BatchSize: 10
            Enabled: true
            ScalingConfig:
              MaximumConcurrency: 2
      MemorySize: 128
      Timeout: 900
      Policies:
        - Statement:
            - Sid: Stmt1679505932243
              Effect: Allow
              Action:
                - sts:AssumeRole
              Resource:
                [
                  arn:aws:iam::XXXXXXXXXXXX:role/cross-account-access,
                  ...
                ]
            - Sid: Stmt1679505932905
              Effect: Allow
              Action:
                - secretsmanager:GetSecretValue
              Resource:
                - Fn::ImportValue:
                    Fn::Sub: "${AirtableSecretsStackName}-ARN"
    Metadata: # Manage esbuild properties
      BuildMethod: esbuild
      BuildProperties:
        Minify: false
        Target: "es2020"
        Sourcemap: true
        EntryPoints:
          - SQSIAMProcessor.ts

Outputs:
  InvestigateIAMPermissionsFunction:
    Description: "Investigate IAM Permissions Lambda Function ARN"
    Value: !GetAtt InvestigateIAMPermissionsFunction.Arn
  InvestigateIAMPermissionsFunctionIamRole:
    Description: "Implicit IAM Role created for Investigate IAM Permissions function"
    Value: !GetAtt InvestigateIAMPermissionsFunctionRole.Arn
  InvestigateIAMPermissionsQueue:
    Description: "The SQS queue to communicate the two lambda functions"
    Value: !Ref InvestigateIAMPermissionsQueue
  InvestigateIAMPermissionsDeadLetterQueue:
    Description: "The dead letter SQS queue to send messages that can't be processed"
    Value: !Ref InvestigateIAMPermissionsDeadLetterQueue

Final thoughts

In this journey of creating a robust microservice, we’ve uncovered the remarkable capabilities of AWS SAM. The ability to deploy, test, and maintain cloud-native applications at lightning speed is a game-changer. The power of AWS SAM, combined with your expertise, opens the door to endless possibilities for innovation and efficiency in your development process.

As you embark on your own AWS SAM adventures, remember that the cloud is your playground, and AWS SAM is your ultimate tool. Keep experimenting, keep building, and enjoy the speed and agility that AWS SAM brings to your development projects. Happy coding!
