Group By and Sum in MongoDB

Kamran Ahmed
InstructorKamran Ahmed

Share this video with your friends

Send Tweet
Published 2 years ago
Updated 2 years ago

In this lesson, we learn how to group documents in a collection using $group operator similar to GROUP BY in SQL. We also look at $sum operator and see how we can use it to count the number of documents as well as get sum of some field in our grouped results.

Here is the list of operators that we cover in this lesson:

You can see the list of all the available operators in the MongoDB docs.

Kamran Ahmed: [0:00] Here in our tweets collection we have a separate document for each of the tweets from a specific user. If we look at the first tweet, it says from screen_name jsbot, if we look at the second tweet it is again from jsbot, if we look at the third tweet, it's from user.screen_name FCBarcelona.

[0:15] Let's say that we want to count the number of tweets from each of the users in this collection. For that we can use the aggregate. I will do db.tweets.aggregate.

[0:23] Our first line is going to be $group. $group is an object, with the first field as _id, which is the column on which you want to group.

[0:30] In our case we want to group on the user.screen_name, so I will do $user.screen_name, and the rest of the parameters of the project We want to count, so I will have another field called count, and we will make it using the sum of 1.

[0:44] If I run this query you will see that we are getting that documents, and in each of the document we have the user_name as id, which is this one, and in the count we are getting the number of tweets from this user. Our sum in this case is doing is whenever is finding the user.screen_name and keeps preparing a group, if it already has a group for that, it just keeps adding one into that. In this case, for LFC, whenever it finds a new user.ScreenName with LFC, it keeps adding one into the current field.

[1:08] For sum we can also give it any field of the document that we have. If you look at the tweets collection, we have a Favorite Count Field in each of the documents.

[1:15] Let's say that we want to get the sum of "Likes" that each of the account has got on their tweets. What I'm going to do now is I'm going to replace count with the Tweet Count, and I'm going to add another field called Favorite Count, which is going to be sum of the Field Favorite Count.

[1:28] If I run the [inaudible] , you will see that in each of the documents we are getting the Tweet Count and the Favorite Count. The Firefox now has one tweet, and Total Favorites that they have got is 14. If you look at Yahoo, they have 11 tweets and Total Favorite Count that they have is 21.

[1:41] Let's start over with users by the people who tweet the most. For that, I'm going to add another stage called Sort. We are going to sort by tweet count in the descending order.

[1:49] If I run the [inaudible] , you will see that GsBot has the most tweets, 226. The second one is 51, and the third one is 50.

[1:56] If you want to get the list of all the users who tweet less than anyone else, we can do +1, so this will be ascending order now. This one has one tweet. The second one has one tweet again, and so on.