Serverless Chrome Puppeteer

Say you want to build a scraper, automate manual testing, or generate custom social cards for your website. What do you do?

One thing you could do is spin up a docker container, set up headless Chrome, add Puppeteer, write a script to run it all, add a server to create an API, and host it all.

Or you can set up Serverless Chrome with AWS Lambda. Write a bit of code, hit deploy, and get a full Chrome browser running on demand.

That's what this chapter is about 🤘

You'll learn how to:

  • configure Chrome Puppeteer on AWS
  • build a basic scraper
  • take website screenshots
  • run it on-demand, for free at low volumes (thanks to Lambda's pay-per-use pricing)

You'll learn this through a silly example: scraping Google search results. (Yes, this is legal for public content.)

Our scraper goes to google.com, types in a phrase, and returns the first page of results as JSON. We'll reuse the same code to return a screenshot.

As usual, we're using the Serverless Framework and TypeScript. You can see the full example code on GitHub.

Serverless Chrome

Chrome's core engine ships as the open source Chromium browser. Other browsers like Brave and Microsoft Edge use this core engine and add their own UI and custom features.

You can use it for browser automation – scraping, testing, screenshots, etc. Any time you need to render a website, Chromium is your best bet.

Normally this means:

  • downloading a chrome binary,
  • setting up an environment that makes it happy,
  • running in headless mode,
  • configuring processes that talk to each other via complex sockets

Luckily others have solved this problem for you.

Rather than figure it out yourself, I recommend using chrome-aws-lambda. It's the most up-to-date package for running Serverless Chrome.

The older serverless-chrome package has become outdated and no longer runs. It wasn't as stable either.

Here's what you need for a Serverless Chrome setup:

  1. install dependencies
$ yarn add chrome-aws-lambda puppeteer@3.1.0 @types/puppeteer

This installs everything you need to both run and interact with Chrome. ✌️

Check chrome-aws-lambda/README for the latest version of Chrome Puppeteer you can use. Make sure they match.

  2. configure your serverless.yml
# serverless.yml
service: serverless-chrome-example

provider:
  name: aws
  runtime: nodejs12.x
  stage: dev

package:
  exclude:
    - node_modules/puppeteer/.local-chromium/**

Configure a new service, make it run on AWS, and use a recent Node runtime.

The package part is important. It tells Serverless not to package the chromium binary with your code. AWS rejects builds that big.

You are now ready to start running Chrome ✌️

Chrome Puppeteer 101

Chrome Puppeteer is a set of tools that lets you interact with Chrome programmatically.

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

You write code that interacts with a website like a person would. Anything a person can do on the web, you can do with Puppeteer.

Core syntax feels like jQuery, but the objects returned are different from what you're used to. I've found it's best not to worry about the details.

Here's how you click on a link, for example:

const page = await browser.newPage() // open a "tab"
await page.goto("https://example.com") // navigate to a URL
const div = await page.$("div#some_content") // grab a div
const link = await div.$("a.target_link") // grab a link inside it
await link.click() // click the link

Always open a new page for every new browser context.

Navigate to your URL, then use jQuery-like selectors to interact with the page. You can feed selectors into page.click() and similar methods, or use page.$ to grab elements and work with them directly.
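Two more patterns you'll reach for all the time – a quick sketch against a made-up page, so the selectors are placeholders:

// selectors work directly on the page object
await page.click("a.target_link")

// run code against a matched node inside the browser
const title = await page.$eval("h1", (node) => node.textContent)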

Build a scraper

Web scraping is a fiddly process.

The core idea is easy:

  • load website
  • find content
  • read content
  • return content in new format

Where this gets fiddly is that it doesn't generalize. Each website is a little different.

You have to adapt the core technique to each website you're scraping and there's no telling when the HTML might change.

You might encounter websites that actively fight against scraping – blocking bots, rate-limiting access, and so on.

Please play nice and don't unleash thousands of parallel requests onto unsuspecting websites.

You can watch me work on this project on YouTube, if you prefer video.

And you can see the final result in action here: https://4tydwq78d9.execute-api.us-east-1.amazonaws.com/dev/scraper

1. more dependencies

Start with the serverless.yml and dependencies from earlier in this chapter (chrome-aws-lambda and puppeteer).

Add aws-lambda:

$ yarn add aws-lambda @types/aws-lambda

That installs code you need to interact with the AWS Lambda environment.

2. add a scraper function

Define a new scraper function in serverless.yml

# serverless.yml
functions:
  scraper:
    handler: dist/scraper.handler
    memorySize: 2536
    timeout: 30
    events:
      - http:
          path: scraper
          method: GET
          cors: true

We're saying code lives in the handler method exported from scraper, and that we need a lot of memory and a long timeout. Chrome can be resource intensive and our code makes web requests that might take a while.

All this should fire from a GET request on /scraper.

3. getChrome()

The getChrome method launches a new browser instance. I like to put this in a util file.

// src/util.ts
import chrome from "chrome-aws-lambda"

export async function getChrome() {
  let browser = null

  try {
    browser = await chrome.puppeteer.launch({
      args: chrome.args,
      defaultViewport: {
        width: 1920,
        height: 1080,
        isMobile: true,
        deviceScaleFactor: 2,
      },
      executablePath: await chrome.executablePath,
      headless: chrome.headless,
      ignoreHTTPSErrors: true,
    })
  } catch (err) {
    console.error("Error launching chrome")
    console.error(err)
  }

  return browser
}

We launch a Chrome Puppeteer instance with default config and specify our own screen size.

I like to use the isMobile setting because it tricks many websites into loading faster. The deviceScaleFactor: 2 config helps create better screenshots. It's like using a retina screen.

Adding ignoreHTTPSErrors makes the process more robust.

If the browser fails to launch, we write debug info to logs.

4. a shared createHandler()

We're writing 2 pieces of code that share a lot of logic – scraping and screenshots. They both need a browser, deal with errors, parse URL queries, etc.

Easiest way to handle that is a common createHandler() method that deals with boilerplate and calls the important function when ready.

// src/util.ts
import { APIGatewayEvent } from "aws-lambda"
import { Browser } from "puppeteer"

// the response shape API Gateway expects back from our handlers
// (defined here so the example is self-contained)
type APIResponse = {
  statusCode: number
  body: string
  headers?: { [key: string]: string }
  isBase64Encoded?: boolean
}

// both scraper and screenshot have the same basic handler
// they just call a different method to do things
export const createHandler = (
  workFunction: (browser: Browser, search: string) => Promise<APIResponse>
) => async (event: APIGatewayEvent): Promise<APIResponse> => {
  const search =
    event.queryStringParameters && event.queryStringParameters.search

  if (!search) {
    return {
      statusCode: 400,
      body: "Please provide a ?search= parameter",
    }
  }

  const browser = await getChrome()

  if (!browser) {
    return {
      statusCode: 500,
      body: "Error launching Chrome",
    }
  }

  try {
    // call the function that does the real work
    const response = await workFunction(browser, search)
    return response
  } catch (err) {
    console.log(err)
    return {
      statusCode: 500,
      body: "Error scraping Google",
    }
  }
}

We read the ?search= param, open a browser, and verify everything's ready.

Then we call the passed-in workFunction, which returns a response. If that fails, we return a 500 error.

5. scrapeGoogle()

With all that in place, we're ready to scrape Google search results.

// src/scraper.ts
import { Browser } from "puppeteer"
import { createHandler } from "./util"

async function scrapeGoogle(browser: Browser, search: string) {
  const page = await browser.newPage()
  await page.goto("https://google.com", {
    waitUntil: ["domcontentloaded", "networkidle2"],
  })

  // this part is specific to the page you're scraping
  await page.type("input[type=text]", search)

  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.click("input[type=submit]"),
  ])

  if (!response.ok()) {
    throw "Couldn't get response"
  }

  await page.goto(response.url())

  // this part is very specific to the page you're scraping
  const searchResults = await page.$$(".rc")

  let links = await Promise.all(
    searchResults.map(async (result) => {
      return {
        url: await result.$eval("a", (node) => node.getAttribute("href")),
        title: await result.$eval("h3", (node) => node.innerHTML),
        description: await result.$eval("span.st", (node) => node.innerHTML),
      }
    })
  )

  return {
    statusCode: 200,
    body: JSON.stringify(links),
  }
}

export const handler = createHandler(scrapeGoogle)

There's a lot going on here. Let's go piece by piece.

const page = await browser.newPage()
await page.goto("https://google.com", {
  waitUntil: ["domcontentloaded", "networkidle2"],
})

Open a new page, navigate to google.com, and wait for everything to load. I recommend waiting for networkidle2, which fires once the page has (almost) no in-flight network requests – a good sign that asynchronous requests have finished.

Useful when dealing with more complex webapps.

// this part is specific to the page you're scraping
await page.type("input[type=text]", search)

const [response] = await Promise.all([
  page.waitForNavigation(),
  page.click("input[type=submit]"),
])

if (!response.ok()) {
  throw "Couldn't get response"
}

await page.goto(response.url())

To scrape Google, we have to type a search into the input field, then hit submit and wait for the page to load.

This part will be different for every website.

// this part is very specific to the page you're scraping
const searchResults = await page.$$(".rc")

let links = await Promise.all(
  searchResults.map(async (result) => {
    return {
      url: await result.$eval("a", (node) => node.getAttribute("href")),
      title: await result.$eval("h3", (node) => node.innerHTML),
      description: await result.$eval("span.st", (node) => node.innerHTML),
    }
  })
)

return {
  statusCode: 200,
  body: JSON.stringify(links),
}

When the results page loads, we:

  • look for every .rc DOM element – that's the best identifier of search results I could find
  • iterate through results
  • get the info we want from each

You can use the $eval trick to parse DOM nodes with the same API you'd use in a browser. It executes your method on the node it finds and returns the result.
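If you don't need per-result error handling, the $$eval variant can do the whole extraction in one call. A sketch, not part of the scraper above:

// $$eval runs your callback in the browser against every matched node
const urls = await page.$$eval(".rc a", (nodes) =>
  nodes.map((node) => node.getAttribute("href"))
)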

6. hit deploy and try it out

You now have a bona fide web scraper. It wakes up on demand, runs Chrome, and turns Google search results into easy-to-use JSON.

I left out project configuration boilerplate. You can find those details in other chapters or consult the example code on GitHub.
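Deploying and testing looks roughly like this – the build step depends on how you compile TypeScript into dist/, and the URL is whatever API Gateway assigns you:

$ yarn build                # or however you compile src/ into dist/
$ npx serverless deploy     # creates the Lambda and API Gateway endpoint
$ curl "https://<your-api-id>.execute-api.us-east-1.amazonaws.com/dev/scraper?search=serverless"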

Take screenshots

Taking screenshots is a similar process to scraping. Instead of parsing the page, you call .screenshot() and get an image.

Our example returns that image directly, but you might want to upload to S3 instead and return a URL. Do not use AWS Lambda to serve images in production.

1. tell API Gateway to serve binary

First, we tell API Gateway that it's okay to serve binary data.

I do not recommend this in production unless you have a great reason. Like a dynamic image that changes every time.

# serverless.yml
provider:
  name: aws
  runtime: nodejs12.x
  stage: dev
  apiGateway:
    binaryMediaTypes:
      - "*/*"

You can limit binaryMediaTypes to specific types you intend to use. */* is easier.
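If you do want to be strict, the config might look something like this instead (only the relevant part shown):

# serverless.yml
provider:
  apiGateway:
    binaryMediaTypes:
      - "image/png"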

2. add a new function

Next we define a new Lambda function

# serverless.yml
functions:
  screenshot:
    handler: dist/screenshot.handler
    memorySize: 2536
    timeout: 30
    events:
      - http:
          path: screenshot
          method: GET
          cors: true

Same as before, different name. We need lots of memory and a long timeout.

3. screenshotGoogle()

We're using a lot of the same machinery as before so we can focus on what's different.

// src/screenshot.ts
import fs from "fs"
import { Browser } from "puppeteer"
import { createHandler } from "./util"

async function screenshotGoogle(browser: Browser, search: string) {
  const page = await browser.newPage()
  await page.goto("https://google.com", {
    waitUntil: ["domcontentloaded", "networkidle2"],
  })

  // this part is specific to the page you're screenshotting
  await page.type("input[type=text]", search)

  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.click("input[type=submit]"),
  ])

  if (!response.ok()) {
    throw "Couldn't get response"
  }

  await page.goto(response.url())

  // this part is specific to the page you're screenshotting
  const element = await page.$("#main")

  if (!element) {
    throw "Couldn't find results div"
  }

  const boundingBox = await element.boundingBox()
  const imagePath = `/tmp/screenshot-${new Date().getTime()}.png`

  if (!boundingBox) {
    throw "Couldn't measure size of results div"
  }

  await page.screenshot({
    path: imagePath,
    clip: boundingBox,
  })

  const data = fs.readFileSync(imagePath).toString("base64")

  return {
    statusCode: 200,
    headers: {
      "Content-Type": "image/png",
    },
    body: data,
    isBase64Encoded: true,
  }
}

export const handler = createHandler(screenshotGoogle)

Code looks the same up to when we load the results page. Type a query, hit submit, wait for reload.

Then we do something different – measure the size of our results div.

// this part is specific to the page you're screenshotting
const element = await page.$("#main")

if (!element) {
  throw "Couldn't find results div"
}

const boundingBox = await element.boundingBox()
const imagePath = `/tmp/screenshot-${new Date().getTime()}.png`

if (!boundingBox) {
  throw "Couldn't measure size of results div"
}

We look for the results element and grab its boundingBox(). That tells us the x, y coordinates and the width and height.

That way we can take a more focused screenshot and save on file size.

We set up an imagePath in /tmp. This lets us write to a file on Lambda's hard drive, but it will not stay there. As soon as the lambda turns off, the file is gone.

await page.screenshot({
  path: imagePath,
  clip: boundingBox,
})

We take a screenshot with page.screenshot(). Saves to a file.
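As an aside, if you'd rather skip the temp file, Puppeteer can hand back the image data directly – a small variation, not what this chapter's code does:

// ask Puppeteer for a Base64 string instead of writing to /tmp
const data = await page.screenshot({
  clip: boundingBox,
  encoding: "base64",
})

Our example sticks with the /tmp file, so the next step reads it back from disk.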

const data = fs.readFileSync(imagePath).toString("base64")

return {
  statusCode: 200,
  headers: {
    "Content-Type": "image/png",
  },
  body: data,
  isBase64Encoded: true,
}

We then read the file into a Base64-encoded string and return a response.

The response must contain a content type – image/png in our case – and tell API Gateway that it's Base64-encoded.

This is where you would upload your file to S3 and return a URL in production environments. You'll spend less money that way.
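Here's roughly what the S3 version could look like with the aws-sdk that's already available in the Lambda runtime. This is a sketch, not the chapter's code – the bucket name is made up and your Lambda's IAM role needs permission to write to it:

// a hypothetical upload helper; assumes a bucket you've created yourself
import { S3 } from "aws-sdk"
import fs from "fs"

const s3 = new S3()

async function uploadScreenshot(imagePath: string): Promise<string> {
  const Bucket = "my-screenshot-bucket" // placeholder name
  const Key = `screenshots/${Date.now()}.png`

  // push the file Chrome wrote to /tmp into S3
  await s3
    .putObject({
      Bucket,
      Key,
      Body: fs.readFileSync(imagePath),
      ContentType: "image/png",
    })
    .promise()

  // hand back a URL instead of megabytes of Base64
  // (assumes the bucket or object is publicly readable)
  return `https://${Bucket}.s3.amazonaws.com/${Key}`
}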

How to use this

The most common use cases are:

  1. Running automated tests
  2. Scraping websites cheaply
  3. Generating dynamic HTML-to-PNG images
  4. Generating PDFs

3 and 4 are great because you can build a small website that renders a social card for your content, then use this machinery to turn it into an image.

Same for PDFs – build dynamic website, print-to-PDF with Chrome. Way easier than generating PDFs by hand :)
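The PDF flow reuses everything you've seen – instead of page.screenshot() you call page.pdf(). A rough sketch with a made-up URL and illustrative options:

// assuming `browser` came from getChrome() like before
const page = await browser.newPage()
await page.goto("https://example.com/invoice/42", {
  waitUntil: ["domcontentloaded", "networkidle2"],
})

// render the page to a PDF buffer
const pdf = await page.pdf({
  format: "A4",
  printBackground: true,
})

// return it like the screenshot: pdf.toString("base64") as the body,
// with Content-Type application/pdf and isBase64Encoded: true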

Have fun

Did you enjoy this chapter?

Thanks for supporting Serverless Handbook! Consider sharing it with friends.

New chapters in your inbox every week or so ❤️

Cheers,
~Swizec
