If you feel like skipping the brief introduction below, you can jump straight to the first four triggers with these shortlinks:
- Amazon Cognito User Pools — User management & custom workflows
- AWS Config — Event-driven configuration checks
- Amazon Kinesis Data Firehose — Data ingestion & validation
- AWS CloudFormation — IaC, Macros & custom transforms
A bit of history first
When AWS Lambda became generally available on April 9th, 2015, it was the first Function-as-a-Service offering out there, and there were only a few ways you could trigger your functions besides direct invocation: Amazon S3, Amazon Kinesis, and Amazon SNS. Three months later we got Amazon API Gateway support, which opened up a whole new world for the web and REST-compatible clients.
By the end of 2015, you could already trigger functions via Amazon DynamoDB Streams, Kinesis Streams, S3 objects, SNS topics, and CloudWatch Events (scheduled invocations).
Personally, I started experimenting with AWS Lambda around early 2016 for a simple machine learning use case. A few months later I published the very first video about my experience with Lambda, which covered all the triggers and configurations available at the time; the video is still available here, but the AWS Console is pretty different now so I’d recommend you watch it only if you are feeling nostalgic =)
Back to history…
In the following months, AWS Lambda became very popular and many other AWS services started integrating with it, allowing you to trigger functions in many new ways. These integrations are fantastic for processing/validating data, as well as for customizing and extending the behavior of these services.
You may be already aware of (or intuitively guess) how AWS Lambda integrates with services such as S3, DynamoDB, Kinesis Data Streams, SES, SQS, IoT Core, Step Functions, and ALB. And there are plenty of articles and getting-started guides out there using these integrations as a good starting point for your serverless journey.
In this article, I’d like to share with you some of the many other less common, less well-known, or even just newer ways to invoke your Lambda functions on AWS. Some of these integrations do not even appear on the official Supported Event Sources documentation page yet, and I believe they are worth mentioning and experimenting with.
For each service/integration, I will share useful links, code snippets, and CloudFormation templates & references. Even if you don’t know Python or JavaScript, the code will be pretty self-explanatory, with useful comments. Please drop a comment on Gist or at the bottom of this article if you think something’s missing, if you have questions, or if you need more resources/details.
Let’s get started with the first 4 triggers for AWS Lambda.
1. Amazon Cognito User Pools (custom workflows)
Cognito User Pools allow you to add authentication and user management to your applications. With AWS Lambda, you can customize your User Pool workflows by triggering your functions during Cognito’s operations.
Here’s the list of available triggers:
- Pre Sign-up — triggered just before Cognito signs up a new user (or admin) and allows you to perform custom validation to accept/deny it
- Post Confirmation — triggered after a new user (or admin) is confirmed and allows you to send custom messages or to add custom logic
- Pre Authentication — triggered when a user attempts to sign in and allows custom validation to accept/deny it
- Post Authentication — triggered after a user signs in and allows you to add custom logic after authentication
- Custom Authentication — triggered to define, create, and verify custom challenges when you use the custom authentication flow
- Pre Token Generation — triggered before every token generation and allows you to customize identity token claims (for example, adding new claims or overriding existing ones)
- Migrate User — triggered when a user does not exist in the user pool at the time of sign-in with a password or in the forgot-password flow
- Custom Message — triggered before sending an email, phone verification message, or an MFA code and allows you to customize the message
All these triggers allow you to implement stateless logic and personalize how Cognito User Pools work using your favorite programming language. Keep in mind that your functions are invoked synchronously and must complete within 5 seconds, simply by returning the incoming event object with an additional response attribute.
It might be convenient to handle multiple events with the same Lambda function, as Cognito always provides an attribute named event.triggerSource to help you implement the right logic for each event.
For example, here’s how you’d implement the Lambda function code for a Custom Message in Node.js:
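A minimal sketch could look like the following (I’m handling only the sign-up confirmation message here, and the email subject and body are placeholder copy):

```javascript
// Minimal sketch of a Custom Message handler (Node.js).
// Cognito replaces the event.request.codeParameter placeholder
// with the actual verification code before sending the message.
exports.handler = async (event) => {
  // Always check which workflow triggered this invocation
  if (event.triggerSource === 'CustomMessage_SignUp') {
    event.response.emailSubject = 'Welcome to our service!'; // placeholder copy
    event.response.emailMessage =
      `Thanks for signing up! Your verification code is ${event.request.codeParameter}.`;
  } else {
    // Raise a warning in case of unhandled trigger sources
    console.warn(`Unhandled trigger source: ${event.triggerSource}`);
  }
  // Cognito expects the (possibly modified) event object back
  return event;
};
```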
As you can see, the logic is completely stateless, and it’s considered best practice to always check the triggerSource value to make sure you are processing the correct event, raising an error or warning in case of unhandled sources.
The following code snippet shows how you can define the Lambda function and Cognito User Pool in a CloudFormation template (here I’m using AWS SAM syntax, but you could also use plain CloudFormation):
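A minimal version could look like this (the function name, runtime, and CodeUri are illustrative):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:

  CustomMessageFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs8.10  # current at the time of writing
      CodeUri: ./src       # illustrative path

  UserPool:
    Type: AWS::Cognito::UserPool
    Properties:
      UserPoolName: my-user-pool  # illustrative name
      LambdaConfig:
        CustomMessage: !GetAtt CustomMessageFunction.Arn

  # Cognito needs explicit permission to invoke the function
  CognitoInvocationPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !Ref CustomMessageFunction
      Principal: cognito-idp.amazonaws.com
      SourceArn: !GetAtt UserPool.Arn
```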
All you need to do is add a LambdaConfig property to your User Pool definition and reference a Lambda function.
You can find all the attributes of LambdaConfig on the documentation page.
2. AWS Config (event-driven configuration checks)
AWS Config allows you to keep track of how the configurations of your AWS resources change over time. It’s particularly useful for recording historical values, and it also allows you to compare historical configurations with desired configurations. For example, you could use AWS Config to make sure all the EC2 instances launched in your account are of a given type (say, t2.small).
As a developer, the interesting part is that you can implement this kind of compliance check with AWS Lambda. In other words, you can define a custom rule and associate it with a Lambda function that will be invoked in response to each and every configuration change (or periodically).
Also, your code can decide whether the new configuration is valid or not :)
Of course, you don’t have to listen to every possible configuration change of all your resources. Instead, you can scope the rule to specific resources based on:
- Tags (for example, resources with an environment or project-specific tag)
- Resource Type (for example, only AWS::EC2::Instance)
- Resource Type + Identifier (for example, a specific EC2 Instance ARN)
- All changes
There are many AWS Lambda blueprints that allow you to get started quickly without coding everything yourself (for example, config-rule-change-triggered). But I think it’s important to understand the overall logic and moving parts, so in the next few paragraphs we will dive deep and learn how to write a new Lambda function from scratch.
Practically speaking, your function will receive four very important pieces of information as part of the input event:
- invokingEvent represents the configuration change that triggered this Lambda invocation; it contains a field named messageType that tells you whether the current payload relates to a periodic scheduled invocation (ScheduledNotification), to a regular configuration change (ConfigurationItemChangeNotification), or to a change whose content was too large to be included in the Lambda event payload (OversizedConfigurationItemChangeNotification); for regular configuration changes, invokingEvent will also contain a field named configurationItem with the current configuration, while in the other cases we will need to fetch the current configuration via the AWS Config History API
- ruleParameters is the set of key/value pairs that you optionally define when you create a custom rule; they represent the (un)desired status of your configurations (for example, desiredInstanceType=t2.small) and you can use its values however you want; let’s say this is a smart way to parametrize your Lambda function code and reuse it with multiple rules
- resultToken is the token we will use to notify AWS Config about the configuration evaluation results (see the three possible outcomes below)
- eventLeftScope tells you whether the AWS resource to be evaluated has been removed from the rule’s scope, in which case we will just skip the evaluation
Based on the inputs above, our Lambda function will evaluate the configuration’s compliance and invoke the PutEvaluations API with one of three possible results:
- COMPLIANT if the current configuration is OK
- NON_COMPLIANT if the current configuration is NOT OK
- NOT_APPLICABLE if this configuration change can be ignored
Ok, enough theory :)
Let’s write some code and see AWS Config in action.
For example, let’s implement a custom rule to check that all EC2 instances launched in our account are t2.small using Node.js:
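Here’s a self-contained sketch that handles the plain ConfigurationItemChangeNotification case (oversized notifications would additionally require fetching the configuration via the AWS Config History API):

```javascript
// Sketch of an AWS Config custom rule handler (Node.js, aws-sdk v2).
const aws = require('aws-sdk');
const config = new aws.ConfigService();

// Compare the actual instance type with the desired one
function evaluateChangeNotificationCompliance(configurationItem, ruleParameters) {
  if (configurationItem.resourceType !== 'AWS::EC2::Instance') {
    return 'NOT_APPLICABLE';
  }
  if (configurationItem.configuration.instanceType === ruleParameters.desiredInstanceType) {
    return 'COMPLIANT';
  }
  return 'NON_COMPLIANT';
}

exports.handler = async (event) => {
  // Both invokingEvent and ruleParameters arrive as JSON strings
  const invokingEvent = JSON.parse(event.invokingEvent);
  const ruleParameters = JSON.parse(event.ruleParameters);
  const configurationItem = invokingEvent.configurationItem;

  // Skip the evaluation if the resource left the rule's scope
  const compliance = event.eventLeftScope
    ? 'NOT_APPLICABLE'
    : evaluateChangeNotificationCompliance(configurationItem, ruleParameters);

  // Report the evaluation result back to AWS Config
  await config.putEvaluations({
    Evaluations: [{
      ComplianceResourceType: configurationItem.resourceType,
      ComplianceResourceId: configurationItem.resourceId,
      ComplianceType: compliance,
      OrderingTimestamp: configurationItem.configurationItemCaptureTime,
    }],
    ResultToken: event.resultToken,
  }).promise();
};
```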
In my full implementation, I import a simple utility module (that you can find here) to make the overall logic more readable; the sketch above inlines the equivalent checks.
Most of the magic happens in the JavaScript function named evaluateChangeNotificationCompliance. Its logic is parametrized by ruleParameters, in particular the value of desiredInstanceType (which we will define in the CloudFormation template below), so that we can reuse the same Lambda function for different rules.
Now, let’s define our AWS Config custom rule and Lambda function in CloudFormation:
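Here’s a sketch of the template (SAM syntax again, with illustrative names). Note the Lambda permission that allows AWS Config to invoke the function, plus the config:PutEvaluations permission the function itself needs:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:

  ConfigRuleFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs8.10
      CodeUri: ./src
      Policies:
        # The function reports results back to AWS Config
        - Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action: config:PutEvaluations
              Resource: '*'

  # AWS Config needs explicit permission to invoke the function
  ConfigInvocationPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !GetAtt ConfigRuleFunction.Arn
      Principal: config.amazonaws.com

  EC2InstanceTypeRule:
    Type: AWS::Config::ConfigRule
    DependsOn: ConfigInvocationPermission
    Properties:
      ConfigRuleName: ec2-instance-type
      Scope:
        ComplianceResourceTypes:
          - AWS::EC2::Instance
      InputParameters:
        desiredInstanceType: t2.small
      Source:
        Owner: CUSTOM_LAMBDA
        SourceIdentifier: !GetAtt ConfigRuleFunction.Arn
        SourceDetails:
          - EventSource: aws.config
            MessageType: ConfigurationItemChangeNotification
```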
Defining a custom rule is fairly intuitive. In the Scope property I am selecting only AWS::EC2::Instance resources and I am passing t2.small as an input parameter of the custom rule. Then, I define the Source property and reference my Lambda function.
You can find the full documentation about AWS Config custom rules here, with good references for scheduled rules, tags filtering, etc.
3. Amazon Kinesis Data Firehose (data validation)
Kinesis Data Firehose allows you to ingest streaming data into standard destinations for analytics such as Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and Splunk.
You can have multiple data producers that will PutRecords into your delivery stream. Kinesis Firehose will take care of buffering, compressing, encrypting, and optionally even reshaping and optimizing your data for query performance (for example, in Parquet columnar format).
Additionally, you can attach a Lambda function to the delivery stream. This function will be able to validate, manipulate, or enrich incoming records before Kinesis Firehose proceeds.
Your Lambda function will receive a batch of records and will need to return the same list of records with an additional result field, whose value can be one of the following:
- Ok if the record was successfully processed/validated
- Dropped if the record doesn’t need to be stored (Firehose will just skip it)
- ProcessingFailed if the record is not valid or something went wrong during its processing/manipulation
Let’s now implement a generic and reusable validation & manipulation logic in Python:
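Here’s a sketch of that structure (the user_id check inside transform_data is just a placeholder for your own validation logic):

```python
# Sketch of a generic Kinesis Firehose transformation handler (Python).
import base64
import json


class DroppedRecordException(Exception):
    """Raised inside transform_data to skip/drop the current record."""


def transform_data(data):
    # Your own validation & manipulation logic goes here;
    # the user_id check below is just an illustrative placeholder
    if 'user_id' not in data:
        raise DroppedRecordException()
    data['processed'] = True  # example of adding a new field
    return data


def handler(event, context):
    output = []
    for record in event['records']:
        try:
            # Incoming records are base64-encoded (JSON is assumed here)
            payload = json.loads(base64.b64decode(record['data']))
            data = transform_data(payload)
            # Trailing newline so Firehose writes one JSON object per line
            serialized = json.dumps(data) + '\n'
            output.append({
                'recordId': record['recordId'],
                'result': 'Ok',
                'data': base64.b64encode(serialized.encode('utf-8')).decode('utf-8'),
            })
        except DroppedRecordException:
            # Firehose will simply skip this record
            output.append({
                'recordId': record['recordId'],
                'result': 'Dropped',
                'data': record['data'],
            })
        except Exception:
            # Anything else is treated as a processing failure
            output.append({
                'recordId': record['recordId'],
                'result': 'ProcessingFailed',
                'data': record['data'],
            })
    return {'records': output}
```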
The code snippet above is structured so that you only need to implement your own transform_data logic. There you can add new fields, manipulate existing ones, or decide to skip/drop the current record by raising a DroppedRecordException.
A few implementation details worth mentioning:
- Both incoming and outgoing records must be base64-encoded (the snippet above already takes care of it)
- I am assuming the incoming records are in JSON format, but you may as well ingest CSV data or even your own custom format; just make sure you (de)serialize records properly, as Kinesis Firehose always expects to work with plain strings
- I am adding a trailing \n character after each encoded record so that Kinesis Firehose will serialize one JSON object per line in the delivery destination (this is required for Amazon S3 and Athena to work correctly)
Of course, you can implement your own data manipulation logic in any programming language supported by AWS Lambda and — in some more advanced use cases — you may need to fetch additional data from Amazon DynamoDB or other data sources.
Let’s now define our data ingestion application in CloudFormation.
You can attach a Lambda function to a Kinesis Firehose delivery stream by defining the ProcessingConfiguration attribute.
In addition to that, let’s set up Firehose to deliver the incoming records to Amazon S3 every 60 seconds (or as soon as 10MB of data are collected), compressed with GZIP. We’ll also need an ad-hoc IAM Role to define fine-grained permissions for Firehose to invoke our Lambda function and write into S3.
Here is the full CloudFormation template for your reference:
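The bucket, role, and function names below are placeholders you’d adapt to your own stack.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:

  TransformFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: python3.6  # current at the time of writing
      CodeUri: ./src
      Timeout: 60

  DestinationBucket:
    Type: AWS::S3::Bucket

  DeliveryStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      DeliveryStreamType: DirectPut
      ExtendedS3DestinationConfiguration:
        BucketARN: !GetAtt DestinationBucket.Arn
        RoleARN: !GetAtt DeliveryRole.Arn
        CompressionFormat: GZIP
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 10
        ProcessingConfiguration:
          Enabled: true
          Processors:
            - Type: Lambda
              Parameters:
                - ParameterName: LambdaArn
                  ParameterValue: !GetAtt TransformFunction.Arn

  # Fine-grained permissions for Firehose to write into S3
  # and invoke the transformation function
  DeliveryRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: firehose.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: firehose-delivery
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - s3:AbortMultipartUpload
                  - s3:GetBucketLocation
                  - s3:GetObject
                  - s3:ListBucket
                  - s3:ListBucketMultipartUploads
                  - s3:PutObject
                Resource:
                  - !GetAtt DestinationBucket.Arn
                  - !Sub '${DestinationBucket.Arn}/*'
              - Effect: Allow
                Action:
                  - lambda:InvokeFunction
                  - lambda:GetFunctionConfiguration
                Resource: !GetAtt TransformFunction.Arn
```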
The best part of this architecture in my opinion is that it’s 100% serverless and you won’t be charged if no data is being ingested. So it allows you to have multiple 24x7 environments for development and testing at virtually no cost.
You can find the complete CloudFormation documentation here. Plus, you’ll also find an end-to-end pipeline including Amazon API Gateway and Amazon Athena here.
4. AWS CloudFormation (Macros)
We have already seen many CloudFormation templates so far in this article. That’s how you define your applications and resources in a JSON or YAML template. CloudFormation allows you to deploy the same stack to multiple AWS accounts, regions, or environments such as dev and prod.
A few months ago — in September 2018 — AWS announced a new CloudFormation feature called Macros.
CloudFormation comes with built-in transforms such as AWS::Include and AWS::Serverless that simplify template authoring by condensing resource definition expressions and enabling component reuse. These transforms are applied to your CloudFormation templates at deployment time.
Similarly, a CloudFormation Macro is a custom transform backed by your own Lambda Function.
There are three main steps to create and use a macro:
- Create a Lambda function that will process the raw template
- Define a resource of type AWS::CloudFormation::Macro (resource reference here), map it to the Lambda function above, and deploy the stack
- Use the Macro in a CloudFormation template
Macros are particularly powerful because you can apply them either to the whole CloudFormation template — using the Transform property — or only to a sub-section — using the intrinsic Fn::Transform function, optionally with parameters.
For example, you may define a macro that will expand a simple resource MyCompany::StaticWebsite into a proper set of resources and corresponding defaults, including S3 buckets, CloudFront distributions, IAM roles, CloudWatch alarms, etc.
It’s also useful to remember that you can use macros only in the account in which they were created and that macro names must be unique within a given account. If you enable cross-account access to your processing function, you can define the same macro in multiple accounts for easier reuse.
How to implement a CloudFormation Macro
Let’s now focus on the implementation details of the Lambda function performing the template processing.
When your function is invoked, it’ll receive the following as input:
- region is the region in which the macro resides
- accountID is the account ID of the account invoking this function
- fragment is the portion of the template available for processing (could be the whole template or only a sub-section of it) in JSON format, including siblings
- params is available only if you are processing a sub-section of the template and it contains the custom parameters provided by the target stack (not evaluated)
- templateParameterValues contains the template parameters of the target stack (already evaluated)
- requestId is the ID of the current function invocation (used only to match the response)
Once the processing logic is completed, the Lambda function will need to return the following three attributes:
- requestId must match the same request ID provided as input
- status should be set to the string "success" (anything else will be treated as a processing failure)
- fragment is the processed template, including siblings
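Putting the contract together, a minimal processing function could look like this in Python (the S3 versioning tweak is purely illustrative):

```python
# Minimal sketch of a macro processing function (Python).
# Here the fragment is assumed to be a whole template, i.e. the macro
# is referenced via the top-level Transform property of the target stack.
def handler(event, context):
    fragment = event['fragment']

    # Illustrative example: enforce versioning on every S3 bucket
    # (customizing properties without adding or removing resources)
    for resource in fragment.get('Resources', {}).values():
        if resource.get('Type') == 'AWS::S3::Bucket':
            properties = resource.setdefault('Properties', {})
            properties['VersioningConfiguration'] = {'Status': 'Enabled'}

    return {
        'requestId': event['requestId'],  # must match the input request ID
        'status': 'success',
        'fragment': fragment,  # the processed template
    }
```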
It’s interesting to note that in some cases the processed fragment will be the same fragment you receive as input.
I can think of four possible manipulation/processing scenarios:
- Your function processes some resources and customizes their properties (without adding or removing other resources)
- Your function extends the input fragment by creating new resources
- Your function replaces some of the resources — potentially your own custom types — with other real CloudFormation resources (note: this is what AWS SAM does too!)
- Your function does not alter the input fragment, but intentionally fails if something is wrong or missing (for example, if encryption is disabled or if granted permissions are too open)
Of course, your macros could be a mix of the four scenarios above.
In my opinion, scenario (4) is particularly powerful because it allows you to implement custom configuration checks before the resources are actually deployed and provisioned, as opposed to the AWS Config solution we discussed at the beginning of this article, which evaluates changes only after they happen.
Scenario (3) is probably the most commonly used, as it allows you to define your own personalized resources such as MyCompany::StaticWebsite (with S3 buckets, CloudFront distributions, or Amplify Console apps) or MyCompany::DynamoDB::Table (with enabled autoscaling, on-demand capacity, or even a complex shared configuration for primary key and indexes), etc.
Some of the more complex macros make use of a mix of stateless processing and CloudFormation Custom Resources backed by an additional Lambda function.
Here you can find real-world implementation examples of CloudFormation Macros, the corresponding macro templates, and a few sample templates too. I am quite sure you will enjoy the following macros in particular: AWS::S3::Object, Count, StackMetrics, StringFunctions, and more!
How to deploy a CloudFormation Macro
Once you’ve implemented the processing function, you can use it to deploy a new macro.
Here is how you define a new macro resource:
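A minimal SAM-based sketch (the processing function’s properties are illustrative):

```yaml
Resources:

  MacroFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: python3.6
      CodeUri: ./src

  MyMacro:
    Type: AWS::CloudFormation::Macro
    Properties:
      Name: MyUniqueMacroName
      FunctionName: !GetAtt MacroFunction.Arn
```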
That’s it!
AWS CloudFormation will invoke the processing function every time we reference the macro named MyUniqueMacroName in a CloudFormation template.
How to use a CloudFormation Macro
Using a macro is the most likely scenario for most developers.
It’s quite common that macros are owned and managed by your organization or by another team, and that you’ll just use/reference a macro in your CloudFormation templates.
Here is how you can use the macro defined above and apply it to the whole template:
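The bucket below is just placeholder content for the macro to process:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: MyUniqueMacroName  # the whole template goes through the macro
Resources:
  MyBucket:
    Type: AWS::S3::Bucket
```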
In case you’d like to apply the same macro only to a sub-section of your template, you can do so by using the Fn::Transform intrinsic function:
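The Parameters map is optional, and the key and value below are purely illustrative (they’d arrive in the params attribute we saw above):

```yaml
Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    Properties:
      # Only this Properties section is handed to the macro
      Fn::Transform:
        Name: MyUniqueMacroName
        Parameters:
          SomeKey: some-value  # illustrative custom parameter
```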
Let me know what CloudFormation Macros you’ll build and what challenges they solve for your team!
Conclusions
That’s all for Part 1 :)
I hope you have learned something new about Amazon Cognito, AWS Config, Amazon Kinesis Data Firehose, and AWS CloudFormation.
You can now customize your Cognito User Pools workflow, validate your configurations in real-time, manipulate and validate data before Kinesis delivers it to the destination, and implement macros to enrich your CloudFormation templates.
In the next two parts of this series, we will learn more about other less common Lambda integrations for services such as AWS IoT 1-Click, Amazon Lex, Amazon CloudWatch Logs, AWS CodeDeploy, and Amazon Aurora.
Thank you for taking the time to read such a long article.
Feel free to share and/or drop a comment below.
Originally published on HackerNoon on Apr 2, 2019.