
Elements of serverless – lambdas, queues, gateways, and more

We mentioned lambdas, queues, and other elements of serverless in previous chapters – Architecture Principles and Serverless Flavors – but what are they?

Let me explain.

Lambda – a cloud function

"Lambda" comes from lambda calculus, a mathematical definition of functional programming Alonzo Church introduced in the 1930s. It is the alternative to Turing's turing machines for describing a system that can solve any solvable problem. Turing machines describe iterative step-by-step programming, lambda calculus describes functions-calling-functions functional programming.

Both approaches are equal in power.

No wonder AWS decided to name its cloud functions AWS Lambda. As the platform's popularity grew, "lambda" morphed into a generic term for cloud functions and the core building block of serverless computing.

A lambda is a function. In this context, a function running as its own tiny server when triggered by an event.

Here's a lambda function (yes, that's redundant, just like "ATM machine") that returns "Hello world" in response to an HTTP event.

// src/handler.ts
import { APIGatewayEvent } from "aws-lambda";

export const handler = async (event: APIGatewayEvent) => {
  return {
    statusCode: 200,
    body: "Hello world",
  };
};

A TypeScript file exports a function called handler. The function accepts an event and returns a response. The AWS Lambda machinery takes care of the rest.

Because this is a user-facing API method, it accepts an AWS API Gateway event and returns an HTTP-style response: status code and body.

Other providers and services will have different events and expect different responses, but a lambda always follows this pattern 👉 function with an event and a return value.
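To make that concrete, here's a minimal sketch of the same pattern with a different trigger – an S3 event handler (the bucket wiring is assumed). Only the event shape and the return value change:

import { S3Event } from "aws-lambda";

export const handler = async (event: S3Event) => {
  // S3 events carry a list of changed objects instead of an HTTP request
  for (const record of event.Records) {
    console.log(`${record.eventName}: ${record.s3.object.key}`);
  }
  // no HTTP-style response needed – nobody's waiting for a status code
};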

Considerations with lambda functions

Your functions should follow functional programming practices:

  • idempotent – multiple invocations with the same inputs must produce the same result
  • pure – a function can't rely on anything it doesn't fetch itself. Your environment does not persist and data in local memory can vanish at any time. Rely on the arguments you're given and nothing else.
  • light on side-effects – you need some side-effects to make changes to your system. Make sure those come in the form of invoking other functions and services. State inside your lambda does not persist.
  • do one thing and one thing only – small functions focused on a single task are easiest to understand and combine (see the sketch below)
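Here's a hypothetical sketch of those practices in one handler. The upsertCharge helper is made up – the point is that re-running the function with the same input is safe:

import { APIGatewayEvent } from "aws-lambda";

// hypothetical helper – an upsert means saving the same charge twice is safe
declare function upsertCharge(id: string, amount: number): Promise<void>;

export const handler = async (event: APIGatewayEvent) => {
  // pure: rely only on the arguments you're given
  const { chargeId, amount } = JSON.parse(event.body || "{}");

  // idempotent: the same chargeId + amount always produces the same result
  await upsertCharge(chargeId, amount);

  // one thing only: accept a charge, save it, respond
  return { statusCode: 200, body: "ok" };
};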

Small functions work together to produce extraordinary results. Like this example of combining Twilio and AWS Lambda to answer the door.

Creating lambdas

In the open source Serverless Framework, you define lambda functions with serverless.yml like this:

functions:
  helloworld:
    handler: dist/helloworld.handler
    events:
      - http:
          path: helloworld
          method: GET
          cors: true

Define a helloworld function and say it's handled by the handler method exported from the dist/helloworld file. Notice that we're running a build step for TypeScript: the code lives in src/, but we run it from dist/.

events lists the triggers that fire this function – for us, an HTTP GET request on the /helloworld path.

Other common triggers include queues, S3 changes, and CloudWatch events. At least in the AWS world.

Queue

Queue is short for message queue – a service built on top of the queue data structure. Software engineers aren't that inventive with names and like to overload words 🤷‍♂️

You can think of the queue data structure as a list of items.

[queue diagram]

Enqueueing adds items to the back of a queue, dequeueing takes them off the front. Items in the middle wait their turn, just like a lunchtime burrito queue at a food truck. First come, first served – FIFO for short (first in, first out).
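In code, a minimal sketch of the data structure is an array used from both ends:

const queue: string[] = [];

queue.push("first"); // enqueue at the back
queue.push("second");

const next = queue.shift(); // dequeue from the front → "first"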

A messaging queue takes this data structure and scales it into a service.

Different implementations of messaging queues exist and they all share these core properties:

  • persistent storage – queues have to be reliable, so they store messages in a database. Redis is common for its speed, Postgres works too, you could even write to a file. Many queues prioritize speed and keep messages in memory, persisting just enough to survive a crash.
  • a worker process – once you have messages, you need to process them. A process can periodically check if there's something new (polling), or the check can be triggered every time a message arrives.
  • a trigger API – when the queue sees a new message, it triggers your code. In serverless that means running your lambda, in traditional environments it's another worker process.
  • a retry/error policy – most queues help you deal with errors. When you fail to process a message, it goes back on the queue and gets retried later.

Many modern queues add time to the mix. You can schedule messages for later. 2 seconds, 2 minutes, 2 days, whatever you want. Some queues limit how long messages can stay.

Time is particularly useful when dealing with errors.

Server processes can fail for any reason at any time. The error is often temporary, so queues use exponential backoff when retrying – giving your system more and more time to recover.

Try to process a message. It fails. Try again 1 second later. Fails. Retry in 2 seconds. Fail. 4 seconds ...
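A minimal sketch of that backoff calculation, assuming a base delay of 1 second:

function backoffDelay(attempt: number, baseSeconds = 1): number {
  // 1s, 2s, 4s, 8s, ... – the delay doubles with every failed attempt
  return baseSeconds * 2 ** attempt;
}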

I've seen corrupt messages, also known as poison pills, backing off as far as days into the future. We'll talk about dealing with unprocessable messages in the Robust Backend Design chapter.

Defining a queue

AWS Simple Queue Service – SQS – is the simplest queue service I've found in the AWS serverless ecosystem. More powerful alternatives exist, but they require more setup, more upkeep, and have features you might not need.

Always use the simplest and smallest service that solves your problem until you know why you need something better ✌️

Using the Serverless Framework, you define an SQS queue in the resources section like this:

resources:
  Resources:
    MyQueue:
      Type: "AWS::SQS::Queue"
      Properties:
        QueueName: "MyQueue-${self:provider.stage}"

A resource named MyQueue of the SQS Queue type, with a queue name of MyQueue-{stage}. Adding the stage to your queue name lets you deploy to different logical environments without collisions.

Make sure you use correct capitalization.

The second Resources must be capitalized, and so must Type, Properties, and QueueName. That's because you're dropping a layer below Serverless and writing AWS's native CloudFormation configuration. It's what the Serverless Framework compiles your config into before deploying.

Processing a queue

You need a lambda to process messages on MyQueue.

functions:
  myQueueProcess:
    handler: dist/lambdas/myQueue.handler
    events:
      - sqs:
          arn:
            Fn::GetAtt:
              - MyQueue
              - Arn
          batchSize: 1

A lambda triggered each time MyQueue has a new message to process. With a batchSize of 1, each message is processed in its own lambda invocation – a good practice for initial implementations. More about batch sizes in the Lambda Workflows and Robust Backend Design chapters.

The weird YAML syntax reads as: an SQS event fired by the queue whose ARN is getAttribute(Arn) of MyQueue. Amazon Resource Names (ARNs) are unique identifiers for every resource in your AWS account.
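For reference, an SQS ARN looks something like this (region and account number are made up):

arn:aws:sqs:us-east-1:123456789012:MyQueue-dev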

That myQueue.handler lambda would look like this:

// src/lambdas/myQueue.ts
import { SQSEvent, SQSRecord } from "aws-lambda";

export const handler = async (event: SQSEvent) => {
  // the number of records depends on the batchSize setting
  const messages: string[] = event.Records.map(
    (record: SQSRecord) => record.body
  );

  // do something with each message
  // throw Error() on fail
  return true;
};

An async function that accepts an SQSEvent, which contains one or more messages depending on batchSize. Never assume the total number because the setting might change without warning.

Extract messages into an array of strings and process them. Throw an error when something goes wrong so SQS can retry. Return true on success.

API Gateway

You might not realize this, but most servers don't talk directly to the internet.

Most servers are protected by a reverse proxy – a server specializing in taking raw internet requests and routing them to application servers.

[api gateway diagram]

Requests come from the wild internet into the proxy. The proxy then decides:

  • is this a valid request?
  • will this request break stuff?
  • which type of server can handle this?
  • which instance of that server should do it?

Validating requests can involve everything from checking for denial-of-service attacks and verifying permissions to A/B testing and firewall protection.

Once verified, the request is passed to an instance of your application server. I say instance because this is where horizontal scaling comes into play.

Amazon calls its reverse proxy API Gateway. Different providers use different names, but they all perform the same function: take requests from the internet and pass them on to your lambda. Often on a fresh new container for near-infinite scale. 🤘

Static file storage – S3

The serverless world is ephemeral, which means you can't save files just anywhere.

Adding them to your code makes spinning up new containers slow. Can't save locally either because your server goes away after each request.

Static file storage services solve that problem.

AWS S3 is the most common. Some of you might remember FTP. Most other services I've seen use S3 behind the scenes. Perhaps because it's really cheap to save files there.

You can think of S3 as a hard drive with an API. Read and write files, get their URL, stuff like that.

Each file gets a URL served by a server optimized for static files. No code, no nothing. Just a raw file flying through HTTP.
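Here's a minimal sketch of that hard-drive-with-an-API idea using the AWS SDK v3. The bucket name is made up:

import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});

// write a file – S3 serves it back at a URL, no server code needed
export async function saveFile(key: string, body: string) {
  await s3.send(
    new PutObjectCommand({
      Bucket: "my-bucket", // made-up bucket name
      Key: key,
      Body: body,
    })
  );
}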

Static file server – CDN

[cdn diagram]

CDNs – content delivery networks – are the next step in static file storage. They're like a distributed caching system.

Where S3 serves files from a central location, a CDN serves those same files from as close to the end user as possible. Reducing round-trip times and speeding up your website.

Configuration from your end goes something like this:

  1. Point the CDN to your static file URL
  2. The CDN gives you a new URL
  3. Use that URL in your client-side code

You can often automate this part with build tools. Netlify and Zeit both handle it for you.

Now when a browser requests a file, the URL resolves to the physically nearest server. If that server has the file, it serves the request. If not, the CDN server fetches the file from your original source, saves it locally, and sends it back to the user.

And just like that JavaScript, HTML, images, CSS, and others are fast to load anywhere in the world. 👌

Logging

Logging is one of the hardest problems in a distributed multi-service world. You can't just print to the console or write to a local file.

Nobody's looking at the console and files vanish after every request.

So what do you do?

You use a logging service and send logs to an API that collects them in a central location. Often these logs are organized by service so they're easier to follow.

Implementing a logging service yourself is tricky (I've done it, do not recommend), and the AWS ecosystem has you covered with CloudWatch. Mostly. The UI tools for reading logs aren't the best and there's power-user stuff you can't do, but it's a great start.

Anything you console.log is collected in CloudWatch alongside some default logs.
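A minimal sketch – anything this handler logs shows up in the function's CloudWatch log group:

export const handler = async (event: unknown) => {
  // structured JSON logs are easier to search in CloudWatch later
  console.log(JSON.stringify({ level: "info", message: "invoked" }));
  return { statusCode: 200, body: "ok" };
};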

Those are the important elements

Now you know the most important elements of your serverless ecosystem:

  • lambdas for doing things
  • queues for communicating
  • gateways for handling requests
  • S3 for static files
  • CDN for serving static files
  • logging to keep track

There's a bunch more to discover, but that's the core. Next chapter we're gonna look at storing your data.

Did you enjoy this chapter?

Thanks for supporting Serverless Handbook! Consider sharing it with friends.

New chapters in your inbox every week or so ❤️

Cheers,
~Swizec
