Using Sets in ES6 to produce lists of unique objects is faster than using arrays and less error-prone than using objects. In this lesson, we explore the pitfalls of the object approach and the speed implications of the array approach. We will then instrument the array approach and the set approach to measure the number of operations each performs, and its implications for program speed.
Instructor: [00:00] Before ES6 introduced the native Set data structure, there were two main ways to emulate a set of unique values in JavaScript, one with objects and one with arrays. The object version is fast, in that it has an order-one lookup on the backing object, using the in operator to determine if a value is present.
[00:17] It has an order-one assignment to add a value to the set. If we run this code, we see that it outputs true, because we've created a new object set, added a value to it, and then console.log whether or not the set contains the value.
[00:30] However, there's a drawback. It only works for string and number values, because every key of an object gets cast to a string first. If we add one object to this set and then check whether the set contains a second, different object, it will erroneously say true.
[00:43] This is because every key of an object is cast to a string first, and casting a plain object to a string always outputs "[object Object]", producing a false equivalence. To overcome this problem, the array version of sets was born.
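The object-based approach and its pitfall might look like the following sketch (the helper names here are illustrative, not from the lesson's code):

```javascript
// A sketch of the object-based "set" described above.
const objectSet = {};

function objectSetAdd(value) {
  objectSet[value] = true; // O(1) assignment, but the key is cast to a string
}

function objectSetHas(value) {
  return value in objectSet; // O(1) lookup with the in operator
}

objectSetAdd("a");
console.log(objectSetHas("a")); // true

// The pitfall: every key is stringified, and every plain object
// becomes "[object Object]", so two different objects collide.
const value1 = { id: 1 };
const value2 = { id: 2 };
objectSetAdd(value1);
console.log(objectSetHas(value2)); // true — a false positive
console.log(String(value1)); // "[object Object]"
```

The false positive happens because both `value1` and `value2` end up stored under the same `"[object Object]"` key.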
[00:58] If we change our code over to the array version, you can now see that it correctly identifies the different objects, first outputting true, then outputting false. However, this is slow, because it performs an order-N lookup via indexOf on the backing array to determine if a value is present.
[01:17] It also performs an order-N assignment, which implicitly does a lookup of the value first to see if it's present. Sets were added to JavaScript to overcome this specific problem, providing order-one lookups for values of any type.
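The array-based version described above could be sketched like this (again, the helper names are illustrative):

```javascript
// A sketch of the array-based "set".
const arraySet = [];

function arraySetHas(value) {
  // O(n): indexOf scans the backing array, comparing objects by reference.
  return arraySet.indexOf(value) !== -1;
}

function arraySetAdd(value) {
  // O(n): the implicit lookup dominates; the push itself is cheap.
  if (!arraySetHas(value)) arraySet.push(value);
}

const value1 = { id: 1 };
const value2 = { id: 2 };
arraySetAdd(value1);
console.log(arraySetHas(value1)); // true
console.log(arraySetHas(value2)); // false — distinct objects no longer collide
```

Because `indexOf` uses strict equality rather than string keys, distinct objects are told apart correctly, at the cost of a linear scan per operation.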
[01:29] If we change our code to use a native Set, we can see that it still works. Now, you might be saying to yourself, "Yeah, the array version is slow, but is it really that slow?" Let's instrument our native set and array set code to determine the worst-case scenario in terms of comparison operations performed.
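With a native Set, the same check is both correct and an average order-one operation for values of any type:

```javascript
// The same check using a native Set.
const nativeSet = new Set();
const value1 = { id: 1 };
const value2 = { id: 2 };

nativeSet.add(value1); // O(1) average insertion
console.log(nativeSet.has(value1)); // true
console.log(nativeSet.has(value2)); // false
```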
[01:47] To our existing array set code, let's increment the number of array operations performed by the length of the array every time we do a has lookup. Let's also trivially add one array operation every time we do an add.
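An instrumented array set along these lines might look like this (the counter and helper names are assumptions, not the lesson's exact code):

```javascript
// Instrumented array set: count worst-case comparisons.
let arrayOperations = 0;
const backing = [];

function arraySetHas(value) {
  arrayOperations += backing.length; // worst case: compare against every element
  return backing.indexOf(value) !== -1;
}

function arraySetAdd(value) {
  arrayOperations += 1; // count the add itself
  if (!arraySetHas(value)) backing.push(value);
}

arraySetAdd("a"); // +1 for the add, +0 for the lookup on an empty array
arraySetAdd("a"); // +1 for the add, +1 for the lookup against one element
console.log(arrayOperations); // 3
```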
[02:01] Second, we need to make an instrumented version of a set that extends the native Set functionality and simply adds instrumentation. First, we'll override the has method, have it increment the number of set operations performed, and return super.has(value), which calls the has method of the native Set.
[02:26] Let's do the same for add, incrementing the number of set operations performed and returning the super implementation of add. Now let's instantiate these sets. Next, we instantiate a new instance of our instrumented set.
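The instrumented Set described above could be sketched as a subclass (the counter name is an assumption):

```javascript
// A minimal sketch of the instrumented Set.
let setOperations = 0;

class InstrumentedSet extends Set {
  has(value) {
    setOperations += 1; // count the lookup
    return super.has(value); // delegate to the native Set
  }
  add(value) {
    setOperations += 1; // count the insertion
    return super.add(value); // delegate to the native Set
  }
}

const instrumented = new InstrumentedSet();
instrumented.add("a");
instrumented.has("a");
console.log(setOperations); // 2
```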
[02:43] We'll fill both with 1,000 elements, first by calling the array set's add(x), followed by set.add(x). Then we'll do the same for has. Finally, we'll console.log the number of array operations performed and the number of set operations performed.
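Pulling the pieces together, a self-contained sketch of the whole benchmark might look like this (all names are illustrative):

```javascript
// Instrumented array set: count worst-case comparisons.
let arrayOperations = 0;
const backingArray = [];

function arraySetHas(value) {
  arrayOperations += backingArray.length; // worst case: compare every element
  return backingArray.indexOf(value) !== -1;
}

function arraySetAdd(value) {
  arrayOperations += 1; // the add itself
  if (!arraySetHas(value)) backingArray.push(value);
}

// Instrumented native Set: count one operation per call.
let setOperations = 0;

class InstrumentedSet extends Set {
  has(value) {
    setOperations += 1;
    return super.has(value);
  }
  add(value) {
    setOperations += 1;
    return super.add(value);
  }
}

const set = new InstrumentedSet();

// Fill both with 1,000 elements, then look every element up again.
for (let x = 0; x < 1000; x++) {
  arraySetAdd(x);
  set.add(x);
}
for (let x = 0; x < 1000; x++) {
  arraySetHas(x);
  set.has(x);
}

console.log(arrayOperations); // 1500500 — roughly n² for n = 1000
console.log(setOperations); // 2000 — one per add plus one per has
```

The roughly 1.5 million array operations come from the add loop (about n²/2 comparisons) plus the has loop (a full n-element scan for each of the n lookups).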
[03:10] The set version performed 2,000 operations, whereas the array version performed roughly 1.5 million. This is what you will typically see with an n² algorithm: n for the outer loop times another n for the inner loop inside the array add, plus another order-N scan per call to the array set's has. You can see that the number of operations grows very large, very quickly.