Use ES6 Sets to Improve Javascript Performance

Mike Sherov
InstructorMike Sherov

Share this video with your friends

Send Tweet
Published 3 years ago
Updated a year ago

Using Sets in ES6 to produce lists of unique objects is faster than using arrays, and less error prone than using objects. In this lesson, we explore the pitfalls of the object approach and the speed implications of the array approach. We will then instrument the array approach and the set approach to measure the number of operations each approach performs, and it's implications on program speed.

Instructor: [00:00] Before ES6 introduced the native set data structure, there were two main ways to emulate a set of unique values in JavaScript, one with objects and one with arrays. The object version is fast, in that it has an order-one lookup on the backing object using the in operator to determine if a value has present.

[00:17] It has an order-one assignment to add a value to the set. If we run this code, we see that it outputs true, because we've created a new object set, added a value to it, and then console.log whether or not the set contains the value.

[00:30] However, there's a drawback. It only works for string and number values, because every key of an object gets cast to a string first. If we check to see if this set contains value two, it will erroneously say true.

[00:43] This is because every key of an object is cast to a string first. If you cast an object to a string, it will always output object object, producing a false equivalence. To overcome this problem, the array version of sets were born.

[00:58] If we change our code over to the array version, you can now see that it correctly identifies the different objects, first outputting true, then outputting false. However, this is slow, because performance in order-N lookup via indexOf on the backing array to determine if a value is present.

[01:17] An order-N assignment, which implicitly does a lookup of the value to see if it's present first. Sets were added to JavaScript to overcome this specific problem, providing order-one lookups for values of any type.

[01:29] If we change our code to use native set, we could see that it still works. Now, you might be saying to yourself, "Yeah, the array version is slow, but is it really that slow?" Let's instrument our native set and array set code to determine the worst case scenario in terms of comparison operations performed.

[01:47] To our existing array set code, let's increment the number of array operations performed by the length of the array every time we do an as lookup. Let's trivially add one array operation every time we do an add.

[02:01] Second, we need to make an instrumented version of a set that extends the native set functionality, and simply add instrumentation. First, we'll override the has method, have the has method simply increment the number of set operations performed, and return the original implementation of super has value, which will call the has function of the original set.

[02:26] Let's do the same for add, incrementing the number of set operations performed, returning the super implementation of add. Let's instantiate a few of these sets. Next, we instantiate a new instance of our instrumented set.

[02:43] We'll fill it with 1,000 elements, first by calling array set.add(X), followed by set.add(X). Then we'll do the same for has. We'll console.log the number of array operations performed, and we'll console.log the number of set operations performed.

[03:10] The set operations version had 2,000 executions, where the array operations version had 1.5 million executions. This is what you will typically see when you have an n² algorithm, n² being n for the outer loop, and another n for the inner loop for array add, and another order-N operation in array set has. You could see the number of operations gets large very quickly.

Thomas Dillard
Thomas Dillard
~ 3 years ago

Subjectively the more important thing here might be the ease of adding decorators in ES6. Great video.

Doma
Doma
~ 3 years ago

What if using console.time to test looks more accurate?

Mike Sherov
Mike Sherovinstructor
~ 3 years ago

Hi Darren,

Thanks for the feedback! Console.time would be good too. In this specific case though, the actual run time difference is negligible because 1.5 million operations is still decently fast, and modern JS engines do tricky things to speed up things further for simple loops. In reality, Set usage is likely more complicated than simple iteration.

My intention was to show how quickly the number of operations grows for O(n^2) algos vs. O(n) to give the viewer a more intuitive sense of the numbers, and less about the actual performance of the contrived example.

Doma
Doma
~ 3 years ago

Hi Mike,

So do you recommend to use set method store data instead of using array? I found set api can easier convert to array via Array.from method and it has its own forEach method for looping. Or it depends on different data structures. Thanks for response.

Mike Sherov
Mike Sherovinstructor
~ 3 years ago

Sets can only represent unique sets of elements, and compared to Arrays, have a smaller, less powerful set of APIs. I typically only Sets when I have that unique requirement.

Doma
Doma
~ 3 years ago

Okay Mike, Thank you for explanation. Very helpful.

Ajay Kumar
Ajay Kumar
~ 3 years ago

Hey Mike, Thanks for the Video. Shouldn't we use (setOps = setOps + this.size) to compute the setOps in has() as internally set might be iterating over all the elements to find if a value already exist in set or not?

Mike Sherov
Mike Sherovinstructor
~ 3 years ago

Hey Mike, Thanks for the Video. Shouldn't we use (setOps = setOps + this.size) to compute the setOps in has() as internally set might be iterating over all the elements to find if a value already exist in set or not?

Good question! Set (and Map) has been specifically designed to have O(1) lookups and does not need to iterate over all its elements to do so.

Ajay Kumar
Ajay Kumar
~ 3 years ago

Hey Mike, Thanks for the Video. Shouldn't we use (setOps = setOps + this.size) to compute the setOps in has() as internally set might be iterating over all the elements to find if a value already exist in set or not?

Good question! Set (and Map) has been specifically designed to have O(1) lookups and does not need to iterate over all its elements to do so.

Found algorithm of has in ES6 doc file. https://www.ecma-international.org/ecma-262/6.0/#sec-set.prototype.has . I hope I am looking in the right direction.

Mike Sherov
Mike Sherovinstructor
~ 3 years ago

Found algorithm of has in ES6 doc file. https://www.ecma-international.org/ecma-262/6.0/#sec-set.prototype.has . I hope I am looking in the right direction.

Almost :-). See here: https://www.ecma-international.org/ecma-262/6.0/ where it says:

Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection. The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects. It is not intended to be a viable implementation model.

“Hash tables” and “sublinear access” are the key words there.

Ajay Kumar
Ajay Kumar
~ 3 years ago

Got It! :-). Thanks for your quick revert.

Tre' Codez
Tre' Codez
~ 3 years ago

This was awesome! I like you explained it!