Surpass JS Performance with Optimized Collision Detection in WASM using a Linked List Grid

Guy Bedford
InstructorGuy Bedford
Share this video with your friends

Social Share Links

Send Tweet

To write a grid collision detection optimization in C requires thinking about the data structure of the grid more than in JS. In C we can represent this with a linked list and show how to convert the JS optimization code into this structure. When we compare the final performance, the WebAssembly code is now faster than the JS code - being able to write a low-level, more efficient data structure in memory is what gives WebAssembly its best advantage over JS. For those new to the principles, a good introduction to data structures and algorithms is the CLRS Introduction to Algorithms book.

Demo repo: https://github.com/guybedford/wasm-demo

Note the demo is currently only supported in Chrome Canary with the Experimental Web Platform flag (chrome://flags/#enable-experimental-web-platform-feature) enabled due to the use of ES modules and WebGL 2.0.

[00:00] On the right of the screen over here, I've got the optimized JS collision detection algorithm. On the left, we've got the C code for the unoptimized version. I'm going to bring across the optimizations from the JS version into the C code so we can get our optimized collision algorithm in WebAssembly to compare the performance.

[00:17] If you're not very familiar with C programming, please don't worry if you don't follow every single detail here.

[00:22] The first difference between the two is on the right. I've got this grid width and grid height being defined after the circle counts. I'm going to define those in the C code.

[00:34] The rest of the code for the grid optimization is in a time-step function. We create the grid, initialize the grid, and then we start assigning the circles that belong to that cell in the grid.

[00:47] Starting with the initialization, we can create a grid in C using a multi-dimensional array of the grid width and grid height. This will give us the same thing that we had in JavaScript, but the value of each grid cell in C can't just be an arbitrary array of circles in that grid cell because we don't have arbitrary length arrays in C.

[01:10] The legacy option would be to just assign the maximum amount of memory. We could say that there's never going to be more than a thousand circles in a given grid cell, and so we just make every grid cell an array of a thousand circles.

[01:24] The reference for each circle that we used in the JS code is that each circle was referenced by its integer index, so we'll have a thousand integers per grid cell. This is a terrible waste of memory, though.

[01:36] Instead, a better data structure that we can use is a linked list. The principle of a linked list is that instead of storing the index of the circle for each cell, we create a record data type which contains that index, and then also has a pointer to another record for the next record in that cell.

[01:55] The pointer is a memory reference to another record of exactly the same type. When we no longer point to anything else, we know we've reached the end of the list. This way, any number of individual circles can be referenced from a single grid cell.

[02:08] The cell circle records then need their own place to be stored in memory. I'm going to initialize a set length of them using an array. A given circle is never going to belong to more than four grid cells at a time due to the design on the grid size.

[02:21] The amount of memory I'll set aside for this is the number of circles times four, as we'll never have more records than that. The grid itself now becomes the first layer of our linked list. Each grid cell corresponds to a pointer into the first record. From that record, we can follow the next cell records to get the full list of circles in a given cell.

[02:40] That completes the data structuring. We can now move on to the initialization step.

[02:46] The loop over the grid cells remains much the same except we don't have to cache the columns anymore for performance. To set the individual grid cells to an empty list initially, I'm going to use the null pointer. This is a zero-memory reference. To use this, I'm going to have to include standard def.h from the header files.

[03:09] Next, we move on to the code that allocates each circle to a grid cell. I'll copy that indirectly and update it to represent valid C code.

[03:21] We determine the cell range of the current circle, and then loop through that range, assigning it to each of those grid cells. The assignment we'd like to be making here is to set a new record into the grid to represent this circle within the cell. In most cases, there'll already be a record in place.

[03:41] This is where the linked list comes in. We have to set the next value of the current record to the last record that was in place. We chain on and amend the list each time. The record also contains the index of the current circle.

[03:56] The cell circle record is instantiated as a struct. I've glossed over a detail here, in that the grid values were memory addresses of the individual cell circle records. This needs to be a pointer, and then we need to use the pointer struct to access for these assignments.

[04:12] What does this pointer point to, or we already allocated all the space in memory for the cell circles as the cell circles array? The pointer can be set to reference an exact record in this memory. Ideally, we want to allocate from the first record to the last.

[04:25] I'm going to create a counter that allows me to determine how many we've allocated, and then we know the next one is just at that count, which we can then increment each time.

[04:38] For the collision detection algorithm, I'm going to copy the loop over all of the grid cells from the JS optimized version and update that to the C code. Let me just correct those grid width and grid height names as well.

[04:58] When reading the grid cell, this is now the pointer to our cell circle record. We're going to instantiate that struct pointer. Looping over all the records means looping over those next values. Using a while loop, we can then keep updating the individual cell circle record to the next record in the list.

[05:22] Copying over the inner part of the loop, I'll update these valuables for the index position and velocity values of the current circle. For the loop over all of the next circles in the cell, I'm going to start with the JS cell circle equal to exactly the ISL circle.

[05:39] In the while statement itself, I'll update to the next value and check that it's not the null reference before continuing the loop. This way, the JS circle that we're comparing with is the one after the current circle, as it should be, and it's a little bit less code to write in the loop.

[05:54] With the optimization loops complete, lastly, we need to include the collision detection code, which is already in C. Once that's in place, the optimization is complete.

[06:11] Upping the circle count before compiling, let's now compare the performance. The WebAssembly version on the right is now faster than the JavaScript version on the left. It's our ability to create more efficient data structures in WebAssembly that gives us the true performance benefits.