Understand the Basics of SQL and Relational Databases

Share this video with your friends

Social Share Links

Send Tweet
Published 4 years ago
Updated 3 years ago

To effectively use Supabase, you need to understand the basics of SQL, or Structured Query Language, and how it can be used to work with relational databases. In this talk, we'll look at the basics of SQL — how to query for data, how to insert and remove data, and how structured databases are organized using tables, rows, and columns.

Kristian Freeman: [0:00] Before we get too much farther into Supabase and understanding how things work when it comes to actually querying and updating our data, we need a brief introduction to SQL.

[0:10] What is SQL? Well, it's the Structured Query Language. It's used to give us a consistent way to query and modify our database. SQL operates on what are called relational databases. Relational databases describe how all the data in the database should look as well as how it relates to each other.

[0:30] Importantly, it is very, very descriptive. Sometimes, you might even feel like it's frankly a little bit too much. It's too descriptive, but it's that descriptiveness and it's, of course, talking about how the different pieces of data relate to each other that really gives it its power.

[0:47] Just to give you an example, let's break down what a user might look like in a relational database. A user may have three different fields -- ID, Created At, and Name.

[0:59] Not only is ID a field but we say, specifically, it is an int4 or integer type. Created At isn't a random field called Created At. It is specifically a timestamp. Name, obviously, a field that says my name is Kristian Freeman or something like that, that is specifically this text.

[1:20] Not only are we saying, "Hey, there is this data inside of a user," we also say the type -- this is int4, this is timestamp, this is text. We can even be more descriptive because relational databases, such as postgres and other tools, allow us to be more specific with what we say about those fields and how they work.

[1:42] ID field is an int4 integer and ID increments. For instance, we have user number one, which is our first user, user number two, which is our second user, etc. Not only is this field ID and integer, but we also can say it does things like increment.

[2:00] Created At is a timestamp, of course, but we can also say things like, and Created At defaults to Now. This is a function that says when we create a new user, Created At should be set to whatever the timestamp is right now.

[2:15] Finally, Name is a text field, but we can also say that Name can't be blank. When we create a new user, the Name field needs to be set to an actual value. We can't leave it blank.

[2:27] You might be asking, "How does this relate to NoSQL?" Relational database, that is what we're calling a SQL database and that deals with what we call rows and columns. We'll talk about that here in a bit.

[2:40] NoSQL, which would be things like DynamoDB or MongoDB. These are what we call NoSQL databases. Particularly, they are what we call a document database. Instead of having rows and columns, which again, we'll talk about here in a sec, NoSQL databases tend to be a little bit more unstructured. That makes it easy to iterate quickly on things.

[3:03] It also means that you lose some of that rigidity, that can make relational databases really powerful. For different use cases, whether you pick a relational database, or NoSQL database, they both have their pros and cons. It's worth noting, they are very different. You can go explore things like Mongo and DynamoDB if you want to learn more about how those tools work.

[3:24] Let's talk about tables, rows and columns. These are the building blocks of all relational databases. Inside of your Postgres database, the first high-level thing that you have, or I should say, in any of your SQL databases at large, not just Postgres, the first thing you have is what's called a table.

[3:43] Tables represent a type of data. For instance, if I'm storing users, or I'm storing messages, all of those users or all of those messages get stored inside of a table. It represent a single type of data, users messages, think of some other examples, notifications, channels, you know, replies. Basically, any high-level piece of data gets stored inside of a single table.

[4:12] Rows represent an object in that table. For instance, my specific user account or the last message that I wrote in Slack or Discord. If a table represents all of the users or all of messages, or whatever else, a row represents one specific instance, or object in that table.

[4:33] A column represents a specific value inside of that row. If we go back and look at our user example, which has an ID, Created At, and Name, each one of these is a specific column. A user has an ID, it has Created At, and it has a Name. These are all columns.

[4:55] Again, we have table, row and column. To just bring that to our user example, our table here is users. A row is one instance of user. Then, our columns are ID, Created At and Name.

[5:11] Now that we know the basics of how a relational database is laid out, let's start looking at what it means to actually write SQL. How do we actually do things with that data? The first thing we'll look at is querying data. Primarily querying data happens via the SELECT command.

[5:27] SELECT allows you to, as you might guess, select data from a table. The way that you do this is you say, select and then columns, which we'll look at here in a sec what those columns could be, from table.

[5:40] You'll notice that the way that I've written this is SELECT is capitalized, and then columns which is a placeholder here. I'll show you in a second. Then from all of this is kind of subjective. You can write SQL in all capital letters. You can write it lowercase letters.

[5:54] The SELECT here can be whatever case you want. Though the columns and table are not case insensitive. They have to match whatever you give say, what you call ID or what you call your table users. Just be aware of that.

[6:09] Of course, I'll show you some more examples here so you can start to wrap your head around it. Though importantly, SQL statements do end in a semicolon, so make sure you do that. If you try and run this without a semicolon, it will assume that you have more SQL to write, like you're doing a new line or something like that. Just be aware of that.

[6:26] A trick here is that if you don't know what columns you want, which again, I'll show you an example here in a sec, you can use the asterisk or star to select all columns. What that looks like is say, this SQL statement right here. Select star from public.user, that's my table here.

[6:43] It's just this public.user, public being this thing called a schema, which is a name space you can think of that we'll look at as we get further through Supabase's data management. What I'm saying here is select all columns, that's this asterisk here, from the table public.user.

[7:02] If I run this in my postgres CLI, which we'll talk a little bit about later, you can see that I get two values back here. My ID which is a UUID, which is a unique text string that represents the ID for my user, as well as the username, which is just my first user.

[7:19] What we've done here is we've looked at the public.user table and we've said, get all of the columns. Get ID and username, which are the two columns that we have set up on my user table, and actually make them output to the screen.

[7:37] This, as you might imagine, is our row, right? This is an example of a user. This is actually coming, by the way, from our Supabase example, which we'll look at here as we go through the rest of the course.

[7:47] Just be aware that this is what your data will actually look like just to start to get you comfortable with that. This is our row. This is our instance of a user and this is our column ID and a second column username.

[8:01] We're saying select all of the rows from public user, getting all of the columns here. Just to show you, you can select all of the columns in a table. You can also just select say, one.

[8:16] If I say, select username from public.user, we're still saying, go get all of the rows from the public.user table, but just get the column username. We can be more specific there. You can see that the output is a lot smaller. In this case, we don't have the ID coming through.

[8:33] Next up is inserting data. We've looked at how to query data that's already existing in the database. How do we actually insert it? Well, as you might guess, that's done with the INSERT command, which allows you to insert a row into a table.

[8:49] An example of this you can say, insert into and then give it a table name. We'll look at that in a sec, but there's two important things here. In parentheses here, we put a list of columns. Column one. Column two.

[9:04] Then we say, values followed by another set of our like a list inside of parentheses where this is value one and value two. Let's look at an actual example of that. I can say insert into public.user. That is my table name.

[9:22] Then in parentheses here, I have the columns that I'm inserting, ID and username. Then down here I've wrapped this just so you can read it a little bit better. I have values. I said ID and username are the columns that I'm going to be inserting.

[9:38] Here are the associated values where this one is the ID. It's in the same order here, so ID. Then this is that value followed by username, which is just my username on Twitter and things like that, which is signalnerve.

[9:54] Just to walk through it again, we say insert into a table. Our table here is public.user. Then the columns that we're going to be inserting, ID and Username. Then here, I can put the associated values for those columns, which is this UUID, as well as a username, signalnerve. If I run that in postgres, I get this insert -1 back.

[10:19] That's just saying that it was successfully able to insert a row. If I were to go back, for instance, and run deselect here, you would imagine that would get back some output that says username, my first user. Then I'd have a second row that says signalnerve.

[10:33] As we begin to insert more things into our database, it becomes time to learn how to filter data. We have all of this stuff getting put into our database and sometimes you want to just look for specific things. To do that, we are going to learn about the WHERE clause.

[10:49] This allows us to filter data based on criteria. Just to see what this looks like, I can say, select star from public.user. We've seen this part. Up until now, it just says, get all of the columns for all of the rows inside of public.user.

[11:06] Now we have this additional WHERE clause, where we can add a specific filter of some kind. I can say, where username is equal to signalnerve.

[11:14] One thing you might not be used to, if you haven't worked with SQL before, is you might look at this and say, specifically username equals signalnerve and say, "Whoa! Aren't you assigning a value here? You're saying username is equal to signalnerve."

[11:28] This is a SQL syntax thing. A single equals here is the same as a double equals in JavaScript. I'm saying look for any rows here in public.user where the username is equal to signalnerve. Just to show you, I can run that here.

[11:45] Select star from public.user where username equals signalnerve, and I get one row back here. ID, which is this UUID, and then Username, which matches signalnerve. That is our WHERE clause here that we added to allow us to filter through data.

[12:03] Just to show you what that looks like, well, without the WHERE clause, of course, select star from public.user. We get both users back here. This is my first user which was already in the database, and then signalnerve which is the thing that we inserted more recently.

[12:17] The WHERE clause allows us to filter out that my first user and just look for rows that match anything inside of this WHERE clause. How does this all relate to Supabase? There was a bunch of SQL syntax and understanding tables and rows and columns.

[12:33] How does this actually relate to working with Supabase practically in your applications? Supabase uses a tool called postgres and it's basically, I would say, the most popular SQL database in the world.

[12:45] If not the most popular, it is definitely there in the top two or three, and Supabase uses it under the hood. Supabase, in particular, manages a lot of stuff for you. It manages creating tables.

[12:57] It manages an auto-generated API for querying and modifying your tables, which we'll look at here in a sec. Then it also manages, importantly, something you don't think about when you go through the tutorials and all of the basics of SQL databases.

[13:12] It also manages permissions for users trying to access those tables. It does all of this for you. Instead, you just work with this part day-to-day. You might spend some time creating tables and saying like, "OK. This is what my data looks like," and stuff like that.

[13:26] Supabase really handles all of the raw SQL stuff, and instead day-to-day, you're going to spend a lot of time working with their auto-generated API, querying, modifying tables.

[13:37] Really you just get to focus on the core stuff like the real application functionality for your specific use case. That being said, you still should know some SQL when using Supabase. That's why I wanted to take a quick detour into introducing SQL.

[13:52] Things like, what is select? What does it mean to actually select a column? What is a row? What is a table? Understanding that mental model of all of those different things helps you better understand what tasks you need to do when it comes time to build a real application.

[14:07] There's also things like what type should I use for a user ID? What about names, right? We looked at things like ID being an integer or UUID names being text. We looked at timestamps and stuff like that.

[14:20] Just knowing those types can be really helpful to help you understand what things you should reach for in the underlying postgres database. Make sure to go check that out on the postgres website.

[14:31] The final thing that you should probably understand or at least have a basic knowledge of is indexes, which we'll cover here in just a sec. The reason for this is that Supabase queries look a lot like SQL, but with JavaScript syntax.

[14:43] This is pulled straight from my auto-generated API documentation for the project that we're going to make in this course. You can see that this is let data set to user an error. Well, this is clearly something coming back from an API request or something like that.

[14:59] When it comes time to actually use Supabase, right? This is our client here. You can see that this is very, very SQL-y looking, supabase.from user. Remember, we had a select that has this from thing. This is our table.

[15:13] Select from user.select some column, other column. This whole line of code here is very, very similar to the SQL statements that we were looking at earlier, but in a JavaScript syntax format.

[15:27] It can be really useful to understand not just the JavaScript part of it, but understand what it actually would translate to in SQL. It'll obviously make you a lot more comfortable working with the API.

[15:39] In case something goes wrong or you just need to go and diagnose things, just having that familiarity even if it's a small amount, just basic familiarity can be really, really helpful. Just as another example here, let's talk about inserting data, right? Here we're querying.

[15:54] When we come down here, we're selecting data and it's another very similar SQL-y kind of thing, right? We say .from user insert. Remember the INSERT command here. We can do things like pass in a set of columns or even pass in multiple rows here.

[16:10] We can insert multiple things at a time, right? This is an array, but at the end of the day this is all very, very SQL-y looking. Again, having that comfortability and knowing at the end of the day, this is just going to get turned into SQL somewhere down the stack, can be really useful.

[16:26] Just a couple other things that you should know about SQL before we wrap up the presentation. The first is what I mentioned earlier, these things called indexes. I'm going to try and cover this.

[16:35] Indexes are, well, a complicated topic, but I'm going to try and make it as basic as I can. They are basically fast column lookups. What does that mean? Well, let's talk about finding a user. If we say, select star from users where ID is equal to 1-2-3-4.

[16:54] Well, indexes provide quick lookups for important columns like ID. If we're going to be looking up users a lot by ID where we're going to constantly be saying, I don't know, a hundred times an hour or something like that, select star from users where ID equals X or Y or Z.

[17:11] Indexes basically make it so that postgres can be really fast about doing these sorts of lookups. In that way, it's a lot like hash tables or objects in JavaScript. Depending on your language background, they're called different things.

[17:24] In JavaScript we have, for instance, this object. It just opens this object here. Then we have these keys here, one or two. Then inside of those, we just have an object where name is Kristian or name is student.

[17:39] If you know anything about how objects work under the hood, or really how hash tables, or dictionaries, whatever you might call them in the language of choice, how these work underneath the hood, is that JavaScript will optimize how object is stored to make lookups really, really fast.

[17:55] For instance, console.log object where I pass in this key1 here, will return named Kristian. In that way, indexes, it's very simplified version of this. In a lot of ways, they are like hash tables, or dictionaries, or objects, where they allow you to do really fast lookups on specific values.

[18:18] These indexes relate to something called primary keys. IDs and any unique value can be called primary keys. What this means is that ID is a value that can't be empty and is the primary way that we lookup a user.

[18:34] It's our primary key. If we look at our public.user, we can see this is a listing of our entire table. Where ID is a UUID, username is text and it says nullable here. This is saying, can this field be empty or not? In this case, it says not null.

[18:53] Every row that comes into our user table has to have an ID that is not empty. What that means is that we can set it as the primary key. Meaning that any time you want to look up a specific user, the best and fastest way to do that is going to be via the ID.

[19:11] Your tables should basically always have a primary key. The majority of that time it's going to be an ID of some sort. Whether that's' ID or UUID or something like that, that's usually the best way to do it. Unless there is a clear and specifically unique second choice.

[19:30] Primary keys are indexed. We talked about indexing and then I went on this tangent about primary keys. Why does this matter? Why do primary keys matter? It's because they are indexed. But, you can also index basically any unique value.

[19:45] Just as an example, if you find yourself looking up users by email, maybe you look up not just ID, so I'm not just doing select star from users where ID equals one. Maybe I'm also often looking up via email, maybe that's because of an external system that I use as part of my business.

[20:03] You can also index email. Anything that's unique, you can index. The only thing you need to know is that having many indexes in your database and specifically on a table can have performance costs.

[20:18] If I index every single column on my users table, it can make it so that every single look up that I do on that table is really slow, because it has to maintain all these indexes, has to look through all of them, etc. Just make sure that it's worth it before you decide to index everything.

[20:35] If you find yourself doing that, indexing everything, it's out of scope for this presentation, but that's usually a sign that you haven't designed your database in a way that is super flexible. It's very unlikely that you want to actually index every single column in your database.

[20:52] Just keep that in mind. Often it will be really clear, the things that you might want to index. Things like, obviously ID is a great choice if that's your primary key. If you find that you're constantly looking up by email or by username or things like that, those are great candidates for indexing as well.

[21:10] Just a few more things you need to know more practically about working with postgres day to day. The postgres CLI is a really powerful tool for interfacing with your postgres database directly in your command line. Generally you can install this. So like on Mac, this is just brew install postgres if you use homebrew.

[21:29] On Linux, it's sudo apt-get install postgreSQL client. Though of course you can go to the postgres website and find specific platform instructions for basically anything.

[21:40] Once you have that installed, you can run the pg command. What that's going to do is it's going to run it locally. OK? So pg just says, hey connect to the local database that I have running on my machine.

[21:54] You may not have a database running on your machine though, for instance. With Supabase, you actually have it running somewhere else. It's an external URL that you want to access because Supabase is managing that postgres database for you.

[22:08] In that case, you can say, pg-h for the host name. You can say, run this on a remote postgres instance. There are instructions on Supabase's documentation for how to connect using the pg command line tool to your Supabase manage postgres database.

[22:27] Once you have that running, you'll see this postgres command line tool. You'll see this postgres command prompt here. This is obviously different, I kind of showed here that the dollar sign thing is pretty common on a bash shell or something like that if you're in the command line.

[22:43] Once you switch into postgres, you'll see this slightly different variation on that. What you can do here, is you can then begin just running basically any SQL query that you want.

[22:52] Select star from users, or insert, or anything else. In that way, I can execute queries against my database right from the command line. That's what I showed through this presentation, is literally running these commands in my own postgres CLI connecting to my Supabase installation.

[23:10] There are a couple of useful postgres CLI commands. I'm not going to dive too deep into all of them, but just two that you should know that I think are really useful. The first of which is just listing your tables. If I say \DT, and then I press enter, I'll get a list back of all of the tables in my database.

[23:31] Public message, which is going to be my message database that contains all my messages for my Supabase application, and then public user, which is a list of all of the users in my Supabase application. You can see they're both type table and they have an owner, which is Supabase admin.

[23:51] DT to get a list of tables. Then the other one here, it's the big output here, which I'll walk through in here in a sec is \D, which allows you to view a specific table. In this case, I say public.user. The reason for that, just to go back, is that to reference this table.

[24:11] It's going to be the schema plus the name. Our schema here is public. This is a public database that can be accessed via the API. Will be public.message or public.user. You've seen the public.user example here a couple times in this presentation.

[24:28] When it comes time to actually look that table up, I'll just say \D public.user. You can see I get table public.user here. Then I'm going to get three different things. The first is a list of all of the columns as well as some information about those columns on my user table.

[24:47] First, I have ID, which is a type UUID. Then it says here not null, as we talked about this ID value cannot be empty, it has to have some value. Then username, which is of type text. There's no nullable field here.

[25:02] There's no default. It's just a plain text field that can represent my username. Then we have an index. Our first index here is user P-Key, which is as you might guess, the primary key. It's stored as part of this data structure called a B-tree.

[25:19] Though specifically, the column here is an ID, right? We talked about indexing our primary key, which is an ID. You can see that represented here in the indexes section. The second part that we have is a referenced by. We haven't quite talked about this yet though.

[25:35] I will show you how this works in practice when we start working inside of Supabase. We have a separate table here called message. This is going to have a foreign key setup, which says message user ID, F-Key or foreign key. A foreign key points between two different tables.

[25:55] If I say foreign key user ID, which is on my message table, it references the user ID column. My messages, which have text in them and things like that, also need to belong to a user. To do that, I set up a user ID column which references this ID here on my user table.

[26:18] All of that is available here, just viewing this table schema inside of the postgres CLI. It's a super, super concise and really helpful way to understand how all of this stuff in your database fits together.

[26:32] The last thing before we move on to the rest of the course and get started actually building a cool, Supabase application is some resources. The first of which is postgresql.org.

[26:43] This is the really awesome postgres website that has all the information you need to know about available versions, as well as what I would say is probably one of the best documentation sites in all of software engineering, which has all of these different versions.

[26:58] Everything that you need to know about literally all of postgres, whether that is the different types like data types, or writing functions, or adding the command line interface, all of the stuff is covered inside of the documentation on the postgres website.

[27:15] It is probably the single best place to learn more about postgres. If you want to move from a beginner to intermediate or advanced SQL writer. The second is an egghead course from Brett Cassette that is called, Get Started With PostgreSQL.

[27:31] It covers a lot of the stuff that we talked about in this presentation. Things like creating a table, inserting data, filtering data, as well as more advanced things.

[27:41] Grouping aggregating data, sorting postgres tables, and even things like postgres's inner join, which allows you to find really interesting intersecting data in your postgres databases.

[27:53] The final part of this is the Supabase documentation for postgres. Supabase uses postgres under the hood. They have a lot of stuff about understanding how schemas work, how tables work, and all the things that I covered here about columns, and rows, and things like that.

[28:12] There is a number of really useful pages here in the Supabase documentation that augment that knowledge and augment that information. Definitely check that out as well.