SQL is dynamic enough to handle queries within queries. These inner queries are called subqueries and they can be used in many different sections of another query. In this video we will use subqueries within a where clause, a join statement, and as a column value.
select user_handle, sku, (select avg(quantity) from Purchases where user_handle = p.user_handle and sku = p.sku) from Purchases p group by user_handle, sku;
Instructor: [00:00] One of my personal favorite aspects of SQL is the ability to write queries within a query. This composability makes it easier to pull out Internet data all in one call rather than making multiple calls and getting data out in pieces.
[00:14] In the grouping and [inaudible] lesson within this course, we wanted to pull out the row that had the earliest create date. If we do man on the create date, and then pull out the first name, we get an error. Our database doesn't know how to handle the other columns while it's aggregating all the create date columns down to one.
[00:31] If we add FirstName to the GROUP BY, we see we get all three rows out, because now that FirstName is grouped by uniques first names, we'll get out the man per grouping of first names. Instead, what we want to do is use a subquery. For that, we'll write, "Select create date and FirstName from users."
[00:49] We'll also use a WHERE create date equals a subquery, selecting out the man create date from our users table. You see that this returns my name for first name and my create date. In order to use a subquery, it needs to be wrapped in parenthesis. Our subquery here returns one man create date from our users table.
[01:12] That's what we use to filter by, where our man from the inner query matches the create date from the outer query. With this filter matching on the correct row, we can just pull out all the columns then. Subqueries can be used in other commands such as INSERT, UPDATE, and DELETE.
[01:29] Let's update the FirstName column of the row that has the earliest create date to Danny. Here you'll see that we've changed from Tyler to Danny. Not only can subqueries be used as a way to filter, but we can use them as a table to join on.
[01:44] Let's write, "Select total on FirstName from users, US, inner join on a subquery." We'll do select count of user handles as total. User handle from purchases or GROUP BY user handle, close parenthesis, and then we'll say we're a user handles match.
[02:05] With this in place, we'll see that Danny had one purchase and Mary had two purchases from the purchases table. Let's walk through this again. The top-level query is going to give us one column that exists on users, and the other that we create from our subquery. Our subquery is getting a count of GROUP BY user handles.
[02:25] This count tells us how many rows are in each group user handle. We inner join on the user handle. We pull out from the subquery and the user handles that already exist on the users table. The returning results show us how many orders each person purchased.
[02:42] Another option is using subqueries as a column value. For example, I want to return the average order size along with all of my skews and user handles instead of my purchases table. I can add a subquery next to my other columns. When I run this, you'll see that I get all of my user handles, all of my skews, and a consistent average across all of the quantities within the purchases table.
[03:05] To clarify, this isn't stating that the average for this user for the skew is this value. This is just the average quantity for all of the purchases made inside of our purchases table. I will, though, post the solution for making this average be a reflection of this user handle and the skew in the notes.
[03:24] Before you look, see if you can do it yourself. You'll want to enter another row with the same user handle and skew and a different quantity to make sure it's working like it should.