⚠️ This lesson is retired and might contain outdated information.

Make a Bot That Analyzes Human Emotions in Photos with Google Cloud Vision API

Instructor: Hannah Davis
Published 8 years ago
Updated 2 years ago

With this bot, we'll find the number of faces in a photo that's tweeted at us and reply with the emotions those faces are expressing, using the Google Cloud Vision API.

The Google Cloud Vision API is worth exploring, and you'll need to create an account before this lesson: https://cloud.google.com/vision/

[00:00] In addition to Twit, we'll also need fs and request, and we'll need access to Google's Cloud Vision API. The syntax for that is var vision = require('@google-cloud/vision').

[00:22] We need to give it the two values we received when setting up our Cloud Vision account. The first is the project ID, which here is twitterbot, and we also need to give it the keyFilename for the key file that we downloaded from the Cloud Vision website.
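Putting that setup together, it might look something like this (a sketch: the pre-1.0 @google-cloud/vision client is assumed, and the twitterbot project ID, keyfile.json filename, and config module holding your Twitter keys are stand-ins for your own values):

```js
var Twit = require('twit');
var fs = require('fs');
var request = require('request');

// Pre-1.0 @google-cloud/vision client: calling the module with
// credentials returns a client object.
var vision = require('@google-cloud/vision')({
  projectId: 'twitterbot',     // your Cloud Vision project ID
  keyFilename: 'keyfile.json'  // the key file downloaded from the Cloud Vision console
});

var config = require('./config'); // your Twitter API keys, as in earlier lessons
var bot = new Twit(config);
```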

[00:39] We're going to make a bot that tries to analyze the emotions in the faces of photos that people tweet at us. First we need to get our mentions. To do that, we'll say var stream = bot.stream('statuses/filter'), and we'll track our screen name, botsWithHand. This will allow us to do something every time botsWithHand is mentioned.
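A sketch of that stream setup (botsWithHand is the instructor's bot; use your own screen name):

```js
// Open a filtered stream that fires whenever our handle is mentioned
var stream = bot.stream('statuses/filter', { track: '@botsWithHand' });
```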

[01:05] A few other tools we have available to us here are stream.on('connecting'), stream.on('connected'), and stream.on('error'). These can all help with troubleshooting our streams.
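For example (a sketch of those optional troubleshooting handlers):

```js
stream.on('connecting', function () { console.log('Connecting to Twitter...'); });
stream.on('connected', function () { console.log('Connected!'); });
stream.on('error', function (err) { console.log('Stream error:', err); });
```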

[01:22] What we really want for the next step is stream.on('tweet'). When we get mentioned in a tweet, the first thing we want to do is check whether there's a photo attached to the tweet. We can check this by looking for tweet.entities.media. If that exists, the next thing we'll want to do is download the photo.

[01:49] We'll need to pass it three things. First, the URL of the photo, which is tweet.entities.media[0].media_url. We'll also need the username of the person who tweeted at us so that we can reply to them; the user is tweet.user.screen_name. Lastly, we'll need the tweet ID string so that we can reply to that tweet.
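Together, the tweet handler might look like this (a sketch; downloadPhoto is the function we define next):

```js
stream.on('tweet', function (tweet) {
  // Only respond if the tweet actually has a photo attached
  if (tweet.entities.media) {
    downloadPhoto(
      tweet.entities.media[0].media_url, // URL of the attached photo
      tweet.user.screen_name,            // who to reply to
      tweet.id_str                       // tweet ID string, so we can reply to it
    );
  }
});
```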

[02:17] Let's make the downloadPhoto function up here. Again, we'll need to pass it the URL, the name to reply to, and the tweet ID. This is similar to what we did in lesson seven. We first need the parameters: the URL, and an encoding, which is binary. We'll say request.get and pass it our parameters and a function with error, response, and body.

[02:54] We'll make a filename, which will be "photo" plus the current time plus ".jpg", and then we'll say fs.writeFile and pass it the filename, the body, "binary", and our callback. Here we can log out that we downloaded the photo.
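A sketch of the whole downloadPhoto function (analyzePhoto, called at the end, is what we write in the next step):

```js
function downloadPhoto(url, replyToName, tweetId) {
  // Binary encoding so the image bytes survive the round trip
  var params = { url: url, encoding: 'binary' };
  request.get(params, function (err, response, body) {
    // Name the file with the current time so downloads don't collide
    var fileName = 'photo-' + Date.now() + '.jpg';
    fs.writeFile(fileName, body, 'binary', function (err) {
      console.log('Downloaded photo!');
      analyzePhoto(fileName, replyToName, tweetId);
    });
  });
}
```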

[03:20] Next is the fun part: we get to analyze the photo. For this, we'll need the filename, the reply-to name, and our tweet ID. Let's write that here: analyzePhoto, with the filename, reply-to name, and tweet ID. This is where we're going to ping Google's Cloud Vision API. The syntax for this is vision.detectFaces, which takes a filename and a function with error and faces. Let's just print faces out so you can see all the cool stuff this gives you.
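A first version of analyzePhoto that just prints the raw results (a sketch, using the pre-1.0 client's detectFaces method as described above):

```js
function analyzePhoto(fileName, replyToName, tweetId) {
  // Ask Cloud Vision for face data in the downloaded photo
  vision.detectFaces(fileName, function (err, faces) {
    console.log(faces);
  });
}
```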

[04:04] If we run this and I go to Twitter and tweet it a photo, we get a lot of cool stuff back. We can see it returned one face. It gives us the rough angles of the face, and the head and face boundaries. It gives us the boundaries of all the features: the ears, eyebrows, forehead, lips, etc.

[04:25] It gives us whether the person's wearing headwear or not, whether the photo is blurred or underexposed. It gives us these four emotions -- joy, sorrow, anger, and surprise.

[04:37] We're just going to look at the emotions for now. We're going to make a list of all the emotions in a photo, and we're going to assume that some photos have more than one face. We'll say var allEmotions = [], a blank array for now. We'll say faces.forEach(function(face)), and we're going to make another function called extractFaceEmotions.

[05:14] Our four emotions are joy, anger, sorrow, and surprise. This function is just going to let us format the data a little differently. It's going to return an array of all the emotions that this face contains.

[05:35] We'll say return emotions.filter(function(emotion)), and if face[emotion] is true, that emotion will be included in the array. Back up here, this is now an array, so we'll say .forEach(function(emotion)), and if allEmotions.indexOf(emotion) is -1, meaning that emotion doesn't exist in allEmotions yet, we'll say allEmotions.push(emotion).
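Here's how those two pieces might fit together (a sketch; in the pre-1.0 client each face object carries boolean joy, anger, sorrow, and surprise fields, and postStatus is the next step):

```js
// Return an array of the emotions this face expresses
function extractFaceEmotions(face) {
  var emotions = ['joy', 'anger', 'sorrow', 'surprise'];
  return emotions.filter(function (emotion) {
    return face[emotion]; // keep only the emotions marked true
  });
}

// Inside analyzePhoto's detectFaces callback:
var allEmotions = [];
faces.forEach(function (face) {
  extractFaceEmotions(face).forEach(function (emotion) {
    // Only add each emotion once, even across multiple faces
    if (allEmotions.indexOf(emotion) === -1) {
      allEmotions.push(emotion);
    }
  });
});
postStatus(allEmotions, replyToName, tweetId);
```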

[06:21] Now we have the emotions of a person in a photo. The next thing we'll do is post the status, and this will take allEmotions, the reply-to name, and the tweet ID. For our postStatus function, we're going to do the normal bot.post('statuses/update'). We're going to need to give it a status and the in_reply_to_status_id parameter, which will be the tweet ID, and then our callback.

[07:16] If there's an error, we'll log it out. Otherwise, we'll say that the bot has tweeted the status. We still need a status, and if you remember, allEmotions is just an array of emotions. We probably want to format it a little differently for our status.

[07:37] Here, we'll say var status = formatStatus, and we'll make a function that takes the array of emotions and the screen name to reply to. Down here, we'll make our function.
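A sketch of postStatus, with formatStatus (written next) building the text:

```js
function postStatus(allEmotions, replyToName, tweetId) {
  var status = formatStatus(allEmotions, replyToName);
  bot.post('statuses/update', {
    status: status,
    in_reply_to_status_id: tweetId // makes the post a reply to the original tweet
  }, function (err, data) {
    if (err) {
      console.log(err);
    } else {
      console.log('Bot has tweeted: ' + status);
    }
  });
}
```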

[07:55] Our bot is going to look at the photo and say "looking happy" if the emotion is joy, and "looking sad" if it's sorrow. In order to do this, we need to reformat our emotions into adjectives. Let's write var reformatEmotions: joy will become happy, anger will become angry, surprise will become surprised, and sorrow will become sad.

[08:21] Below, we'll start our status by saying var status = "@" plus the username, and our bot is always going to say "looking" plus the emotion, so we can put "looking" here as well. First, we'll check to make sure there are any emotions at all by looking at allEmotions.length, and if there are emotions, we'll put our logic here.

[08:48] Otherwise, our status will be status plus "neutral", i.e., "looking neutral". If there are emotions, we'll say allEmotions.forEach with an emotion and an iterator i, and if i is zero, meaning it's the first emotion, status will equal status plus the reformatted emotion.

[09:15] Otherwise, status will equal status plus "and" plus the reformatted emotion. If there are multiple emotions, our bot will say, "looking happy and surprised!" It will add an exclamation mark for good luck.
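Assembled from the last few steps, formatStatus might look like this (a sketch, including the return mentioned next):

```js
// Map the API's emotion names to adjectives for the reply text
var reformatEmotions = {
  joy: 'happy',
  anger: 'angry',
  surprise: 'surprised',
  sorrow: 'sad'
};

function formatStatus(allEmotions, replyToName) {
  var status = '@' + replyToName + ' looking ';
  if (allEmotions.length) {
    allEmotions.forEach(function (emotion, i) {
      if (i === 0) {
        status += reformatEmotions[emotion];           // first emotion
      } else {
        status += ' and ' + reformatEmotions[emotion]; // subsequent emotions
      }
    });
    status += '!'; // an exclamation mark for good luck
  } else {
    status += 'neutral'; // no emotions detected
  }
  return status;
}
```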

[09:33] Down here, we'll return the status. That's it. If we go to Twitter and I submit a photo, we can see that it printed, and we can see that our bot replied. You should know that this API isn't perfect; in particular, it has a difficult time identifying sorrow. But it still opens up a ton of possibilities for interesting bots.