How do I create thumbnails when I upload a video? AWS Lambda!

benjaminadk - Feb 23 '19 - Dev Community

Good question. 🤔

Introduction

I have been playing around with a YouTube clone I call FooTube. I had set up video uploads to be sent from the browser straight to an AWS S3 bucket, so the video file never touched my Node backend. This made server-side video processing a non-starter, which put me in a dilemma: I wanted to generate 3 thumbnails for each video upload, like the real YouTube does. I started thinking about creating a video player off-screen and capturing frames with canvas. While that might be possible, it didn't sound like fun, and that's not what I ended up doing.

[Image: thumbnails]

The research began.

I discovered that YouTube uses deep neural networks to pick out thumbnails that display a subject, a face, or something else that draws attention. They also capture a thumbnail for every second of video and use an algorithm to rank each one. This interesting article written by the YouTube Creator team in 2015 explains further. At this point I decided that just getting 3 thumbnail images would be enough of a challenge for me - since I still had no clue what I was doing. 🤦‍♂️

Companion Video

Disclaimer

Please keep in mind this code is NOT meant to be a production-ready solution; it is more an exploration or proof of concept. There are a lot of moving parts, and while I have managed to get this working in my local environment, I simply cannot guarantee it will work anywhere else! Sorry.

Lambda Functions

The first thing I found out was that I could use AWS Lambda to sort of outsource computations that might normally take place on a server. As a bonus, since I was already using S3, I could attach what amounts to an event listener to trigger my Lambda function whenever a video file was uploaded.

Creating a new Lambda function is straightforward. When prompted, choose to create a function from scratch and come up with a decent name; createThumbnail worked for me. Also, select the Node.js 8.10 runtime.

[Image: create function]

IAM Role Permissions

I had to create a new IAM role to execute this function. This can be done through a simple workflow in the IAM console. Name the role whatever you want, but give it the AWSLambdaExecute permission. This allows PUT and GET access to S3 and full access to CloudWatch Logs. These are all the permissions we need to execute and monitor our createThumbnail Lambda function. I also had to add the ARN for this role to my bucket policy.

[Image: lambda role]

        {
            "Sid": "Stmt**************",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::**********:role/LambdaRole"
                ]
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::bucket/*"
        }

Triggers

Next we need to configure the trigger for our function. We want to listen to the bucket we are uploading videos to and watch for the PUT method, since that is the method used to send the video. Optionally, you can set a prefix and/or suffix to narrow down the trigger - important here, because my function saves the thumbnails to this same bucket. You might use a suffix of mp4 or webm (video formats). My videos were going to the user folder, so I set a prefix of user/ since this would be at the beginning of any key.
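
For reference, here is a minimal sketch of setting the same trigger programmatically with the aws-sdk instead of the console. The bucket name and function ARN are placeholders, and it assumes S3 already has permission to invoke the function.

const AWS = require('aws-sdk')

const s3 = new AWS.S3()

// Fire the function on PUTs under user/ - mirrors the console settings above
s3.putBucketNotificationConfiguration({
  Bucket: 'footube', // your bucket name
  NotificationConfiguration: {
    LambdaFunctionConfigurations: [
      {
        LambdaFunctionArn: 'arn:aws:lambda:us-west-1:123456789012:function:createThumbnail', // placeholder ARN
        Events: ['s3:ObjectCreated:Put'],
        Filter: {
          Key: { FilterRules: [{ Name: 'prefix', Value: 'user/' }] }
        }
      }
    ]
  }
}, err => {
  if (err) console.log(err)
})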

[Image: create trigger]

Once your function is created and its trigger configured, these settings will show up in the S3 bucket referenced by the trigger. In fact, they can be set from either the S3 or the Lambda console. Click the Properties tab, then the Events box, in the S3 console to view events associated with a bucket.

[Image: bucket events]

Getting Code to Lambda

There are a few ways to get code into our Lambda function. AWS provides an online code editor if your package size is less than 3MB. You can also upload a package in the form of a zip file directly to Lambda, or upload a zip file to S3 and then link that to your function. The zip format allows multiple files to be included in your bundle, including typical node_modules dependencies as well as executable files.

In fact, we are going to utilize a couple of executable files to help process our video: ffmpeg is a command line tool to convert multimedia files, and ffprobe is a stream analyzer. You might have these tools installed locally, but we need to use static builds on Lambda. Download choices can be found here. I chose https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz. To unpack the compressed contents I used 7-Zip. Once unpacked, we want to isolate the files ffmpeg and ffprobe, go figure.

[Image: files we want]

Note that user, group, and global all have read/execute permissions. I am on Windows and had a problem keeping these permissions. Lambda permissions are a little tricky, and global read is important for all files. On Windows the problem arose when I attempted the next step.

To get our executable files to Lambda we could put them into a directory with our index.js (the actual function script), then zip and upload that. There are a couple of downsides to this. On Windows, zipping the executable files in Windows Explorer stripped the permissions and caused errors when my function attempted to invoke the executables. Also, every time I made a change in my script I had to re-upload a 40MB file. This is horribly slow and consumes data transfer credit. Not ideal for development, and data transfer can cost 💲. The first part of the solution to this problem is to use a Lambda Layer.

Lambda Layers

A Lambda Layer can hold additional code in the form of libraries, custom runtimes or other dependencies. Once we establish a Layer it can be used in multiple functions and can be edited and saved in multiple versions. Very flexible.

First, we need to place our ffmpeg and ffprobe files into a folder called nodejs - the name is important. I ended up using Windows Subsystem for Linux and the zip command to compress the nodejs folder. This was the easiest way I found to preserve the proper permissions.

From the parent directory of our nodejs folder, I run:

zip -r ./layer.zip nodejs

The -r flag recursively zips the contents of nodejs into a new file called layer.zip.

From the Lambda console, click the Layers tab and create a new layer. When you create your Layer, make sure to set Node.js 8.10 as a compatible runtime. Now you can go back to the function configuration and add our new Layer to createThumbnail.
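
If you prefer to skip the console, here is a minimal sketch of publishing the layer with the aws-sdk instead - the layer name and region are assumptions on my part.

const AWS = require('aws-sdk')
const { readFileSync } = require('fs')

const lambda = new AWS.Lambda({ region: 'us-west-1' }) // assumed region

// Upload layer.zip and mark it compatible with our runtime
lambda.publishLayerVersion({
  LayerName: 'ffmpeg', // hypothetical layer name
  Content: { ZipFile: readFileSync('./layer.zip') },
  CompatibleRuntimes: ['nodejs8.10']
}, (err, data) => {
  if (err) return console.log(err)
  console.log(`published layer version ${data.Version}`)
})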

[Image: function config]

Finally, we get to the code. 😲

Disclaimer

If someone sees anything that could be better here, please comment and let me know. It took me a while to cobble all these ideas together from various corners of the net, and this is the first time I have used Lambda. What I'm saying is I am no expert, but finding an article like this when I started would have been helpful.

Code

Since we took the time to set up a Layer, and our code has no other dependencies, we can type our code directly into the inline editor. I made my local copy in VSCode just to have my preferred editor settings, then copied and pasted.

First we require what we need. The aws-sdk is available in the Lambda environment; child_process and fs are Node core modules.

const AWS = require('aws-sdk')
const { spawnSync, spawn } = require('child_process')
const { createReadStream, createWriteStream } = require('fs')

spawn and spawnSync will allow us to run our executable files from within the Node environment as child processes.

The Lambda environment provides a /tmp directory to use as we wish. We will stream our image data from ffmpeg into /tmp and then read from there when we upload our thumbnails.

Now we can define some variables we will use later.

const s3 = new AWS.S3()
const ffprobePath = '/opt/nodejs/ffprobe'
const ffmpegPath = '/opt/nodejs/ffmpeg'
const allowedTypes = ['mov', 'mpg', 'mpeg', 'mp4', 'wmv', 'avi', 'webm']
const width = process.env.WIDTH
const height = process.env.HEIGHT

We create our S3 instance to interact with our bucket. Since we are using a Layer, the paths to our executable files are in the /opt/nodejs directory. We define an array of allowed file types. Settings for width and height can be set as environment variables from the Lambda console. I used 200x112.

[Image: env]

Our actual function is written in standard Node format and is exported as handler by default; a custom handler name can be set in the console.

module.exports.handler = async (event, context) => {
  const srcKey = decodeURIComponent(event.Records[0].s3.object.key).replace(/\+/g, ' ')
  const bucket = event.Records[0].s3.bucket.name
  const target = s3.getSignedUrl('getObject', { Bucket: bucket, Key: srcKey, Expires: 1000 })
  let fileType = srcKey.match(/\.\w+$/)

  if (!fileType) {
    throw new Error(`invalid file type found for key: ${srcKey}`)
  }

  fileType = fileType[0].slice(1)

  if (allowedTypes.indexOf(fileType) === -1) {
    throw new Error(`filetype: ${fileType} is not an allowed type`)
  }

    // to be continued
}

We make our function async so we can compose our asynchronous code in a way that appears synchronous. First we parse the srcKey from the event passed in by Lambda; this is the key of our video file without the bucket URL. We also grab the bucket name, and we create a signed URL (target) so that ffprobe and ffmpeg can read the video straight from S3. We can save our images to the same bucket as our video as long as our event listener is set up so the function won't fire when they are uploaded. We then isolate the file extension and run some checks to make sure it is valid before continuing.
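
For context, this is roughly the shape of the S3 put event our handler receives - abridged to just the fields the code reads, with hypothetical values:

const event = {
  Records: [
    {
      s3: {
        bucket: { name: 'footube' },
        object: { key: 'user/videos/My+Video.mp4' } // '+' stands in for spaces, hence the replace
      }
    }
  ]
}

With the event parsed and validated, we can probe the video.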

// inside handler function

  const ffprobe = spawnSync(ffprobePath, [
    '-v',
    'error',
    '-show_entries',
    'format=duration',
    '-of',
    'default=nw=1:nk=1',
    target
  ])

  const duration = Math.ceil(ffprobe.stdout.toString())


Here we use spawnSync to run ffprobe and get the duration of the video from stdout. Use toString because the output is a Buffer. Having the duration lets us capture our thumbnails at targeted points throughout the video. I thought taking a thumbnail at 25%, 50% and 75% was a reasonable way to go about getting 3. Of course, with the following functions you can take as many thumbnails as needed. ffprobe can also report much more data than duration, but that is all we are concerned with here.
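
As a quick sketch of the seek math (my own generalization, not code from the final function), evenly spaced seek points for any thumbnail count could be computed like this:

const count = 3
const seeks = Array.from({ length: count }, (_, i) =>
  Math.round((duration * (i + 1)) / (count + 1))
)
// duration = 100 -> [25, 50, 75]

Now for the function that actually grabs a frame.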

  function createImage(seek) {
    return new Promise((resolve, reject) => {
      let tmpFile = createWriteStream(`/tmp/screenshot.jpg`)
      const ffmpeg = spawn(ffmpegPath, [
        '-ss',
        seek,     // seek to this point (in seconds) before capturing
        '-i',
        target,   // input is the signed url to our video
        '-vf',
        `thumbnail,scale=${width}:${height}`,
        '-qscale:v',
        '2',
        '-frames:v',
        '1',
        '-f',
        'image2',
        '-c:v',
        'mjpeg',
        'pipe:1'  // pipe the image to stdout
      ])

      ffmpeg.stdout.pipe(tmpFile)

      ffmpeg.on('close', function(code) {
        tmpFile.end()
        if (code !== 0) return reject(new Error(`ffmpeg exited with code ${code}`))
        resolve()
      })

      ffmpeg.on('error', function(err) {
        console.log(err)
        reject(err)
      })
    })
  }

There is a lot going on here. The function takes a seek parameter, so we can pass in Math.round(duration * .25), for example. The -ss flag followed by a time in seconds seeks the video to that spot before taking our thumbnail. We reference target, which is the signed URL to our video file. We specify the dimensions we want, the quality, frames and format, and finally we pipe the output into a write stream pointed at the /tmp directory. All of this is wrapped in a Promise that resolves when the child process closes successfully.

Understanding exactly what each ffmpeg input does is mad confusing, but the ffmpeg documentation is decent and there are a lot of forum posts out there as well. The bottom line is we have a reusable function that lets us take a thumbnail whenever we want, and it works well in our async/await flow.

  function uploadToS3(x) {
    return new Promise((resolve, reject) => {
      let tmpFile = createReadStream(`/tmp/screenshot.jpg`)
      let dstKey = srcKey.replace(/\.\w+$/, `-${x}.jpg`).replace('/videos/', '/thumbnails/')

      var params = {
        Bucket: bucket,
        Key: dstKey,
        Body: tmpFile,
        ContentType: `image/jpeg`
      }

      s3.upload(params, function(err, data) {
        if (err) {
          console.log(err)
          return reject(err)
        }
        console.log(`successful upload to ${bucket}/${dstKey}`)
        resolve()
      })
    })
  }


Now we write a reusable function that uploads a thumbnail image to an S3 bucket. Since I used prefix and suffix filters, and I am uploading video files to /user/videos, I can just replace videos with thumbnails in the key and my function won't be triggered again. You can put in any dstKey and bucket that you want. Again, we wrap everything in a Promise to help with our async flow.
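
To make the key rewrite concrete, here is what those two replace calls do to a hypothetical upload:

'user/videos/example.mp4'
  .replace(/\.\w+$/, '-1.jpg')          // 'user/videos/example-1.jpg'
  .replace('/videos/', '/thumbnails/')  // 'user/thumbnails/example-1.jpg'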

So our final code might look something like this:

process.env.PATH = process.env.PATH + ':' + process.env['LAMBDA_TASK_ROOT']

const AWS = require('aws-sdk')
const { spawn, spawnSync } = require('child_process')
const { createReadStream, createWriteStream } = require('fs')

const s3 = new AWS.S3()
const ffprobePath = '/opt/nodejs/ffprobe'
const ffmpegPath = '/opt/nodejs/ffmpeg'
const allowedTypes = ['mov', 'mpg', 'mpeg', 'mp4', 'wmv', 'avi', 'webm']
const width = process.env.WIDTH
const height = process.env.HEIGHT

module.exports.handler = async (event, context) => {
  const srcKey = decodeURIComponent(event.Records[0].s3.object.key).replace(/\+/g, ' ')
  const bucket = event.Records[0].s3.bucket.name
  const target = s3.getSignedUrl('getObject', { Bucket: bucket, Key: srcKey, Expires: 1000 })
  let fileType = srcKey.match(/\.\w+$/)

  if (!fileType) {
    throw new Error(`invalid file type found for key: ${srcKey}`)
  }

  fileType = fileType[0].slice(1)

  if (allowedTypes.indexOf(fileType) === -1) {
    throw new Error(`filetype: ${fileType} is not an allowed type`)
  }

  function createImage(seek) {
    return new Promise((resolve, reject) => {
      let tmpFile = createWriteStream(`/tmp/screenshot.jpg`)
      const ffmpeg = spawn(ffmpegPath, [
        '-ss',
        seek,
        '-i',
        target,
        '-vf',
        `thumbnail,scale=${width}:${height}`,
        '-qscale:v',
        '2',
        '-frames:v',
        '1',
        '-f',
        'image2',
        '-c:v',
        'mjpeg',
        'pipe:1'
      ])

      ffmpeg.stdout.pipe(tmpFile)

      ffmpeg.on('close', function(code) {
        tmpFile.end()
        if (code !== 0) return reject(new Error(`ffmpeg exited with code ${code}`))
        resolve()
      })

      ffmpeg.on('error', function(err) {
        console.log(err)
        reject(err)
      })
    })
  }

  function uploadToS3(x) {
    return new Promise((resolve, reject) => {
      let tmpFile = createReadStream(`/tmp/screenshot.jpg`)
      let dstKey = srcKey.replace(/\.\w+$/, `-${x}.jpg`).replace('/videos/', '/thumbnails/')

      var params = {
        Bucket: bucket,
        Key: dstKey,
        Body: tmpFile,
        ContentType: `image/jpeg`
      }

      s3.upload(params, function(err, data) {
        if (err) {
          console.log(err)
          return reject(err)
        }
        console.log(`successful upload to ${bucket}/${dstKey}`)
        resolve()
      })
    })
  }

  const ffprobe = spawnSync(ffprobePath, [
    '-v',
    'error',
    '-show_entries',
    'format=duration',
    '-of',
    'default=nw=1:nk=1',
    target
  ])

  const duration = Math.ceil(ffprobe.stdout.toString())

  await createImage(duration * 0.25)
  await uploadToS3(1)
  await createImage(duration * 0.5)
  await uploadToS3(2)
  await createImage(duration * 0.75)
  await uploadToS3(3)

  return console.log(`processed ${bucket}/${srcKey} successfully`)
}


Tips

  • Lambda allows you to allocate a set amount of memory to your function. I am using 512MB and everything seems to be running well. My function is doing a couple more things than described here and uses around 400MB per invocation.

  • Utilize the CloudWatch logs and the monitoring graphs provided by AWS. My function averages about 12 seconds per invocation. Note that I have a ton of errors on this graph from when I attempted to refactor things (all the green dots at the bottom).

[Image: graph]

  • This version of the code has no contact with the application from which the original video is uploaded. One solution is to send a POST request from the Lambda function to your backend when processing is complete (see the sketch below). Another option I found is that adding a 20 second delay to my video upload flow gives ample time for the thumbnails to be created. When uploading the video we know where it's going, so we know the URL it will eventually have. Since we build our thumbnail keys from the original video key, we know what those URLs will be as well.
const videoUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/videos/example.mp4'

const imageUrl = 'https://s3-us-west-1.amazonaws.com/footube/user/thumbnails/example-1.jpg'

Notice that I allow an extra 20 seconds for processing before I show the thumbnails.
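
If you go the POST route instead, a minimal sketch using Node's built-in https module might look like this - the host, path, and payload shape are all assumptions:

const https = require('https')

function notifyBackend(bucket, srcKey) {
  return new Promise((resolve, reject) => {
    const body = JSON.stringify({ bucket, srcKey }) // hypothetical payload
    const req = https.request({
      hostname: 'api.example.com',      // hypothetical backend host
      path: '/videos/thumbnails-ready', // hypothetical endpoint
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body)
      }
    }, res => {
      res.resume() // drain the response
      res.on('end', resolve)
    })
    req.on('error', reject)
    req.end(body)
  })
}

You would await notifyBackend(bucket, srcKey) right before the final return in the handler.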

[Image: footube]

  • ffmpeg can do much more. It can convert formats. It can even generate a preview GIF like what you see on YouTube when you hover a video thumbnail.
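
Reusing the variables from our handler (ffmpegPath, target, duration, width), a hedged sketch of that GIF idea - my guess at reasonable flags, not tested code from this project - could look like:

// Grab a 3 second animated preview starting at the video's midpoint
const gif = spawn(ffmpegPath, [
  '-ss', String(Math.round(duration * 0.5)),
  '-t', '3',
  '-i', target,
  '-vf', `fps=10,scale=${width}:-1`, // -1 preserves aspect ratio
  '-f', 'gif',
  'pipe:1'
])

gif.stdout.pipe(createWriteStream('/tmp/preview.gif'))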

Resources

Articles I found helpful.

Conclusion

This article ended up way longer than I thought it would. I wanted to give a comprehensive view of how to set this thing up. If I left something out or got something wrong, please let me know.
