An index is a collection of similar documents, much like you would have similar records in a relational database. If you've been following along with this course, you've already been interacting with the Simpsons Index.
Creating an index is done with a PUT request. I'll specify the URL for my cluster and the name of the index that I want to create. In the body, I'll specify the settings for the new index: the number of shards set to two and the number of replicas set to one. When I send this, I get an acknowledgment back that the index has been created.
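If you'd rather script that request than send it by hand, here is a minimal Python sketch using only the standard library. The cluster URL `http://localhost:9200` and the helper name `build_create_index_request` are assumptions for illustration, not something shown in the course. The sketch only builds the request; the lines that would send it are left commented out so it runs without a cluster.

```python
import json
import urllib.request

# Assumption: a local, unsecured cluster. Replace with your own cluster URL.
CLUSTER_URL = "http://localhost:9200"


def build_create_index_request(index_name, shards, replicas):
    """Build (but do not send) the PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return urllib.request.Request(
        url=f"{CLUSTER_URL}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("foo", shards=2, replicas=1)

# To send it against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))  # the acknowledgment from the cluster
```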
Now, if I do a GET request on my cluster and look at the indices, you can see that my new index, foo, has been created with the correct number of primary and replica shards. Shards and replicas are vital components of a scalable and healthy index. An index can potentially store a large amount of data, more than the hardware of a single node can hold.
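The same check can be sketched in Python. This assumes the cat indices endpoint is asked for JSON (`_cat/indices?format=json`) and that the cluster lives at `http://localhost:9200`; both helper names are hypothetical. The sample row at the bottom is hand-written for illustration, not real output from my cluster.

```python
import json
import urllib.request

CLUSTER_URL = "http://localhost:9200"  # assumption: local test cluster


def fetch_indices(cluster_url=CLUSTER_URL):
    """Call _cat/indices and return one dict per index (needs a running cluster)."""
    with urllib.request.urlopen(f"{cluster_url}/_cat/indices?format=json") as resp:
        return json.load(resp)


def shard_settings(rows, index_name):
    """Return (primaries, replicas) for one index from _cat/indices rows."""
    for row in rows:
        if row["index"] == index_name:
            # The cat API reports these counts as strings, so convert them.
            return int(row["pri"]), int(row["rep"])
    raise KeyError(index_name)


# Hand-written sample row for illustration only, not real cluster output.
sample_rows = [{"index": "foo", "pri": "2", "rep": "1"}]
print(shard_settings(sample_rows, "foo"))  # (2, 1)
```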
Elasticsearch allows you to break up your index into smaller pieces called shards. Each shard is a fully functional index that contains a subset of the total data. Combined, the shards make up your entire index.
Breaking the data up into shards allows you to spread your data across multiple servers for improved performance and capacity. The cool thing about it is, with Elasticsearch, you don't have to define how that's done. You just define the number of shards, and Elasticsearch does the rest for you. A replica provides a duplicate copy of your data. This ensures that the data is available if and when some of your backend services fail.
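To give a feel for what "Elasticsearch does the rest" means: broadly, a document is assigned to a shard by hashing a routing value (the document ID by default) and taking the remainder modulo the number of primary shards. The sketch below illustrates that idea only. `zlib.crc32` is a stand-in hash chosen for this example, not the hash Elasticsearch uses internally, and `pick_shard` is a hypothetical name.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value (by default the document ID) to a shard number."""
    # Stand-in hash for this sketch only; Elasticsearch uses its own hash function.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


# With two primary shards, every document lands on shard 0 or shard 1.
for doc_id in ["1", "2", "3", "4"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 2))
```

The same ID always maps to the same shard, which is how a document can be found again later without searching every shard.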
In the foo index we created, the index data will be broken up into two primary shards. Each primary is going to have one replica copy on a separate server, for a total of four shards. For this index to be completely healthy, I need at least two servers in my cluster: one server for a given shard, and a second server for its replica.
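The arithmetic behind those numbers, as a small Python sketch. The function names are made up for this example. The node count relies on the fact that a replica is never placed on the same node as its primary.

```python
def total_shard_copies(primaries, replicas):
    """Every primary shard exists once, plus once more per replica."""
    return primaries * (1 + replicas)


def min_nodes_for_green(replicas):
    """Each copy of a shard needs its own node: the primary plus its replicas."""
    return 1 + replicas


print(total_shard_copies(primaries=2, replicas=1))  # 4
print(min_nodes_for_green(replicas=1))  # 2
```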
One thing to note is that you don't have to specify these shards and replicas. We can create an index here with an empty body. We'll call it foo2 and send that. Again, Elasticsearch acknowledges the request. If we do a GET request for the cat indices, foo2 has been created as an index. By default, the version I'm running creates a new index with five primaries and one replica. (From Elasticsearch 7.0 onward, the default is one primary and one replica.)
In the output, one thing you'll notice is that the health of all of my indices is yellow. That's because I don't have full redundancy. I'm running Elasticsearch on my local laptop with Docker, and I've only got one Elasticsearch node. That means that a second node to store the replica doesn't actually exist, so I can only achieve yellow status.
A yellow status means that all of the data from my indices is available for queries, but there is no redundancy: should something happen to any of my Elasticsearch nodes, the data on that node won't be available. Let's look at a different example and see if I can explain that better.
Imagine I have three servers and an index with two shards and one replica of each. It might be distributed across my nodes like this: one shard on server A, the other shard and a replica on server B, and the other replica on server C. If server C goes offline, our cluster will change to a yellow state, indicating that all of our data is currently available, because we still have it on servers A and B, but we've lost the redundancy that server C provided.
Elasticsearch will automatically start to build a new replica on server A or B to restore this redundancy, but imagine server B goes down before that happens. Now we only have server A with a single shard, so our cluster status turns to red, indicating that some of our data is no longer available.
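Here is that three-server scenario as a small Python model. It is a simplification for illustration: it only counts which copies of each shard are still on a live server, and it ignores the replica rebuilding that Elasticsearch would start in the background. The labels (`P0`, `R0` and so on) and the function name are invented for this sketch.

```python
# Shard layout from the example: "P" = primary, "R" = replica, digit = shard number.
LAYOUT = {
    "A": ["P0"],
    "B": ["P1", "R0"],
    "C": ["R1"],
}
SHARD_COUNT = 2
COPIES_PER_SHARD = 2  # one primary plus one replica


def cluster_status(layout, offline):
    """Return "green", "yellow", or "red" for a given set of offline servers."""
    surviving = {shard: 0 for shard in range(SHARD_COUNT)}
    for server, copies in layout.items():
        if server in offline:
            continue
        for copy in copies:
            surviving[int(copy[1:])] += 1
    if any(count == 0 for count in surviving.values()):
        return "red"  # at least one shard has no copy left
    if any(count < COPIES_PER_SHARD for count in surviving.values()):
        return "yellow"  # all data reachable, but redundancy is lost
    return "green"


print(cluster_status(LAYOUT, offline=set()))  # green
print(cluster_status(LAYOUT, offline={"C"}))  # yellow
print(cluster_status(LAYOUT, offline={"C", "B"}))  # red
```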
The important thing to remember here is that, if we query Elasticsearch, it's still a running service. We're going to get results back, but those results are going to be based only on the data that server A has available to it, not the entire data set from our full index. That's why monitoring and alerting on your cluster health is so important.
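As a starting point for that kind of monitoring, here is a hedged Python sketch that reads the `status` field from the cluster health endpoint (`_cluster/health`) and turns it into an alert message. The cluster URL, the function names, and the wording of the messages are all assumptions for illustration. A real setup would likely run this on a schedule and send the message to whatever alerting tool you use.

```python
import json
import urllib.request

CLUSTER_URL = "http://localhost:9200"  # assumption: local test cluster


def fetch_cluster_health(cluster_url=CLUSTER_URL):
    """Call _cluster/health and return the parsed JSON (needs a running cluster)."""
    with urllib.request.urlopen(f"{cluster_url}/_cluster/health") as resp:
        return json.load(resp)


def alert_message(health):
    """Turn a cluster health response into an alert string, or None if healthy."""
    status = health["status"]
    if status == "green":
        return None
    if status == "yellow":
        return "WARNING: all data is available, but some replicas are missing"
    return "CRITICAL: some data is unavailable, query results may be incomplete"


# Against a running cluster:
# message = alert_message(fetch_cluster_health())
# if message:
#     print(message)
```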