An Introduction to DynamoDB Global Secondary Indexes (GSIs)

Chris Biscardi

Global Secondary Indexes, or GSIs, are a great way to enable a variety of access patterns that a regular key/value document store might not be able to handle. GSIs allow us to create sparse indexes and choose what data is projected into our indexes. Using GSIs effectively is critical to building up single table designs.

Chris Biscardi: [0:00] Here we have a DynamoDB table with a partition key of PK and a sort key of SK. There are three items in our table that are 18 bytes. If we look at indexes, we can see that we don't have any. We'll get back to that shortly. Our partition keys are our user IDs and our sort keys are todo IDs. Our data attribute is a string that holds the text of each todo.

[0:23] Using this schema, we can search for all of user one's todos and get them back. We can even search for a specific todo if we look for user one's todo number two. What we can't do is search for just todo number two without the user ID. To do this, we have to add a GSI or a Global Secondary Index.
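As a rough sketch of the two lookups described here, the functions below build the parameter dictionaries that the low-level DynamoDB Query operation accepts. The table name "todos" and the key values "user#1" and "todo#2" are assumptions made for illustration; the video does not show the exact strings. Nothing here talks to AWS. With boto3 installed, each dictionary could be passed as keyword arguments to a DynamoDB client's query method.

```python
def todos_for_user(table_name, user_id):
    """Build Query parameters for every item in one user's partition."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "PK"},
        "ExpressionAttributeValues": {":pk": {"S": user_id}},
    }


def single_todo(table_name, user_id, todo_id):
    """Build Query parameters for one todo, given both parts of the key."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk AND #sk = :sk",
        "ExpressionAttributeNames": {"#pk": "PK", "#sk": "SK"},
        "ExpressionAttributeValues": {
            ":pk": {"S": user_id},
            ":sk": {"S": todo_id},
        },
    }


# Hypothetical table name and key values, chosen only for this example.
all_todos = todos_for_user("todos", "user#1")
one_todo = single_todo("todos", "user#1", "todo#2")
```

A Query key condition always needs an equality test on the partition key, which is why there is no third function that takes only the todo ID: against the base table, that lookup would need a full Scan.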

[0:44] If we click on indexes, we're told that GSIs allow us to query efficiently over any field which is an attribute in our DynamoDB table. GSIs can treat any table attribute as a key, even attributes not present in all items. We'll cover what this means later.

[1:00] If you know what sparse indexes are, GSIs on table attributes that aren't in every item give us sparse indexes. We can create an index and say that the partition key is going to be SK. It gives us an automatic index name of SK-index, and we get to choose which attributes are going to be projected into this index.

[1:18] A secondary index is basically a copy of our data, stored in a different way so that we can access it in a different way. In this case, we're going to project all of the attributes, but if we had some attribute in our todos that was really heavy or really big, we could project only the keys or specify exactly which attributes we want to get returned by that index.

[1:38] In this case, because this table is set up in provisioned capacity mode, we have to specify the read capacity units and the write capacity units. We get an estimated cost as well. In this case, it's going to cost us three dollars a month.
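The console steps above correspond roughly to an UpdateTable request. The sketch below only builds that request as a dictionary, so it runs without AWS access. The table name "todos" and the capacity values of 5 are assumptions, since the video shows neither. With boto3, the result could be passed as keyword arguments to a DynamoDB client's update_table method.

```python
def create_gsi_params(table_name, key_attribute, projection_type="ALL",
                      non_key_attributes=None, read_units=5, write_units=5):
    """Build UpdateTable parameters that add a GSI keyed on one string attribute."""
    projection = {"ProjectionType": projection_type}
    if projection_type == "INCLUDE":
        if not non_key_attributes:
            raise ValueError("INCLUDE needs at least one non-key attribute")
        projection["NonKeyAttributes"] = list(non_key_attributes)

    return {
        "TableName": table_name,
        # Every attribute used in the index key schema must be declared.
        "AttributeDefinitions": [
            {"AttributeName": key_attribute, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": f"{key_attribute}-index",
                    "KeySchema": [
                        {"AttributeName": key_attribute, "KeyType": "HASH"},
                    ],
                    "Projection": projection,
                    # Only needed for tables in provisioned capacity mode.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


# Project everything, as in the video (table name is hypothetical).
project_all = create_gsi_params("todos", "SK")
# Alternative: copy only the table and index keys into the index.
keys_only = create_gsi_params("todos", "SK", projection_type="KEYS_ONLY")
```

The three projection types are ALL, KEYS_ONLY, and INCLUDE. A narrower projection keeps the index smaller, at the cost of not returning the omitted attributes from index queries.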

[1:54] Creating indexes can take a while. By default, you can create up to 20 global secondary indexes per table, but you'll probably never need more than 5 if you take the single table design approach.

[2:04] The time it takes to create a GSI on a preexisting table is dependent on a variety of factors including the size of the table, the number of attributes projected into the index and write activity on that current table. Now the index is created. Let's go back into our items.

[2:18] If we go to query and we choose our index, you can see that we only have one value to put in. If we put in todo number four and we start our search, we get back user number two's todo number four, "I have to do something."
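Querying the index looks almost the same as querying the table, with an IndexName added and the key condition written against the index's partition key. This is a minimal sketch that builds the Query parameters as a dictionary. The table name "todos" and the key value "todo#4" are assumptions for illustration, while "SK-index" is the name the console generated above.

```python
def query_index_params(table_name, index_name, key_attribute, key_value):
    """Build Query parameters that look up items through a GSI."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_attribute},
        "ExpressionAttributeValues": {":v": {"S": key_value}},
    }


# Find a todo by its ID alone, with no user ID required.
todo_four = query_index_params("todos", "SK-index", "SK", "todo#4")
```

Because the index's partition key is SK, the user ID no longer appears in the key condition at all.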

[2:32] This method of creating GSIs to enable alternate access patterns is quite powerful. It allows us to specify a partition key and a sort key that can be made out of any of our attributes. It also lets us take specific attributes and move them into that index depending on what we actually want back from that index query.

[2:49] Finally, if an item doesn't have the attribute that an index is keyed on, that item won't get indexed. If we had a global secondary index that used the data attribute as its partition key and we created another item that didn't have data, it wouldn't get indexed. This allows us to create smaller indexes that we can query faster, and also to store multiple types of items in the same table.
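To make the sparse index behavior concrete, here is a small in-memory model of it. This is not how DynamoDB is implemented. It only imitates the rule that an item lacking the index key attribute is left out of the index. The items, including the "profile" item, are made up for the example.

```python
def sparse_index(items, key_attribute):
    """Group items by key_attribute, skipping items that do not have it."""
    index = {}
    for item in items:
        if key_attribute not in item:
            continue  # items without the index key are not indexed
        index.setdefault(item[key_attribute], []).append(item)
    return index


# Two todos and one hypothetical profile item sharing the same table.
items = [
    {"PK": "user#1", "SK": "todo#1", "data": "buy milk"},
    {"PK": "user#1", "SK": "todo#2", "data": "walk the dog"},
    {"PK": "user#2", "SK": "profile"},  # no "data" attribute
]

data_index = sparse_index(items, "data")
indexed_count = sum(len(group) for group in data_index.values())
```

The table holds three items, but only the two that carry a data attribute end up in the index, which is what keeps a sparse index small.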