⚠️ This lesson is retired and might contain outdated information.

Paginate through search results in Elasticsearch

Will Button
Published 7 years ago
Updated a year ago

By default, search results are limited to the top 10 results. In this lesson, you will learn how to change the page size, paginate through results, and understand the performance implications of deep pagination in Elasticsearch.

[00:00] If we look at this query search in "The Simpsons" index of type "Episode," it tells me that there's a total of 600 documents that match the search in The Simpsons episode index, but we only saw the first 10 in the results.

[00:14] I can modify that query, and I can change the number that it returns, and set size equal to five. This is going to cause it to return five documents. There's one, two, three, four, five, and that's it.
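As a minimal sketch of the request body for this step: the lesson does not show the query itself, so the `match_all` query below is a placeholder assumption, and the helper name `search_body` is made up for illustration. Only the `size` field is taken from the lesson.

```python
import json


def search_body(size=10):
    """Build a search request body that returns at most `size` hits.

    The match_all query is a placeholder assumption; swap in your own query.
    """
    return {"size": size, "query": {"match_all": {}}}


# Body to send to the index's _search endpoint to get five hits back.
print(json.dumps(search_body(size=5)))
```

Leaving `size` out of the request gives the default of 10 hits described at the start of the lesson.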

[00:32] If I want to see the next five documents that matched, I can include that parameter -- and it's called from -- and say starting at number five, return the next five, and we get those results back as well.

[00:45] Same thing if we wanted to iterate further through it. We could set from to 10, which skips the first 10 documents and returns the next five. You can use that to paginate through your results, but there is a performance limit to this.
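The page arithmetic described here can be sketched as a small helper. The function name `page_params` and the 1-based page numbering are assumptions for illustration; `from` and `size` are the parameters used in the lesson, and `from` is treated as a zero-based offset.

```python
def page_params(page, size=5):
    """Translate a 1-based page number into `from`/`size` values.

    `from` is the number of hits to skip, so page 1 starts at offset 0.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}


# Pages 1, 2 and 3 with five hits per page: offsets 0, 5 and 10.
for page in (1, 2, 3):
    print(page, page_params(page))
```

Merging the returned dictionary into a search request body gives the same three requests walked through in the lesson.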

[00:56] It has to do with the way that Elasticsearch retrieves and sorts results. The query results are retrieved from individual shards within your cluster, then they're sorted and returned.

[01:08] Imagine that we're searching a single index with five primary shards. The initial search for the top 10 results causes each shard to generate its own top 10 results, and those are sent back to the coordinating node for a total of 50 results. Then the coordinating node sorts those 50 results and returns the top 10 back to you.
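To make that gather-and-sort step concrete, here is a toy simulation in plain Python. It does not call Elasticsearch; the random scores and the even split of 600 documents across five shards are assumptions chosen only to mirror the numbers in the lesson.

```python
import heapq
import random


def top_k_from_shards(shards, k=10):
    """Simulate the search: each shard returns its own top k scores,
    then the coordinating node merges them and keeps the overall top k."""
    per_shard = [heapq.nlargest(k, scores) for scores in shards]
    gathered = [score for hits in per_shard for score in hits]
    return len(gathered), heapq.nlargest(k, gathered)


random.seed(0)
# Five shards with 120 made-up relevance scores each (600 documents total).
shards = [[random.random() for _ in range(120)] for _ in range(5)]
gathered_count, top10 = top_k_from_shards(shards, k=10)
print(gathered_count)  # 50 results reach the coordinating node
print(len(top10))      # 10 results go back to the caller
```

The merged top 10 is the same as the top 10 over all 600 scores, which is why each shard only needs to send its own best candidates.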

[01:31] Now, imagine asking for page 1,001 to get results numbered 10,001 through 10,010. Each shard is going to produce its top 10,010 results and send those back to the coordinating node, which sorts all 50,050 results and discards 50,040 of them to return the top 10 to you.
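The arithmetic in this example can be checked with a few lines of Python. The function name `deep_paging_cost` is hypothetical, and the model is deliberately simple: it only counts results, assuming every shard holds enough matching documents.

```python
def deep_paging_cost(from_, size, shards):
    """Count the results involved in one paginated search.

    Returns (results produced per shard, results gathered by the
    coordinating node, results discarded before responding).
    """
    per_shard = from_ + size
    gathered = per_shard * shards
    discarded = gathered - size
    return per_shard, gathered, discarded


print(deep_paging_cost(0, 10, 5))       # (10, 50, 40)
print(deep_paging_cost(10_000, 10, 5))  # (10010, 50050, 50040)
```

The first call is the default top 10 search from earlier; the second is the deep page, where almost everything gathered is thrown away.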

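[01:52] As you scale, the cost of sorting keeps growing: every shard has to produce from plus size results, so the work increases with both the page depth and the number of shards. That's something to keep in mind as you grow your Elasticsearch cluster.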
