Published 5 years ago

Updated 3 years ago

What scores do the 90th percentile of students receive? How long does the 99th percentile web request take? Percentiles and arbitrary ntiles provide a richer understanding of the underlying data and its outliers.

[00:00] What if we want to find percentiles for our data? We can use the NTILE function, which accepts the number of buckets to create. A percentile is an NTILE consisting of 100 different buckets.

[00:15] Here, we'll partition by the school and order by the students' final grade from the students table. What this will do is create 100 identically-sized buckets, or at least try to get as close to identically-sized as possible.
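The query described here might look roughly like the sketch below. It runs the SQL through Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer). The table name `students`, the columns `school` and `final_grade`, and the generated rows are all assumptions made for illustration, not the lesson's actual schema or data.

```python
import sqlite3

# Hypothetical schema and synthetic rows; the lesson's real table,
# column names, and grades may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (school TEXT, final_grade INTEGER)")
rows = [("GP", g % 21) for g in range(400)] + [("MS", g % 21) for g in range(46)]
conn.executemany("INSERT INTO students VALUES (?, ?)", rows)

# NTILE(100) numbers the rows of each school from bucket 1 to 100,
# after sorting that school's rows by final grade.
query = """
SELECT school,
       final_grade,
       NTILE(100) OVER (PARTITION BY school ORDER BY final_grade) AS percentile
FROM students
ORDER BY school, percentile
"""
result = conn.execute(query).fetchall()

# Print the first few rows: the lowest grades land in the lowest buckets.
for school, final_grade, percentile in result[:8]:
    print(school, final_grade, percentile)
```

Because the window is partitioned by school, each school gets its own independent set of buckets, so a student's percentile is relative to their own school only.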

[00:37] Here, we have the second percentile, the third percentile. Each of these consists of four students from GP. Since we're ordering by the final grade, we have to keep going for a little while, because all of these students got zeros, until we finally reach the 9th percentile.

[00:55] There, we have some students with fours and fives. We keep going up. For GP, this works rather well, because even at the tail end we ultimately end up with similarly-sized buckets.
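To see how close to identically-sized the buckets are, one option is to count the rows in each bucket. This is a minimal sketch under assumed names (`students`, `school`, `final_grade`) with 350 made-up rows for a single school. Since 350 does not divide evenly by 100, NTILE gives the earlier buckets one extra row each.

```python
import sqlite3

# Hypothetical table with 350 synthetic rows for one school.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (school TEXT, final_grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("GP", g % 21) for g in range(350)],
)

# Assign percentiles in a subquery, then count the rows in each bucket.
sizes_query = """
SELECT percentile, COUNT(*) AS students_in_bucket
FROM (
    SELECT NTILE(100) OVER (PARTITION BY school ORDER BY final_grade) AS percentile
    FROM students
)
GROUP BY percentile
ORDER BY percentile
"""
sizes = [count for _, count in conn.execute(sizes_query)]

# 350 = 3 * 100 + 50, so the first 50 buckets hold 4 rows and the rest hold 3.
print(sizes[0], sizes[49], sizes[50], sizes[-1])  # prints: 4 4 3 3
```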

[01:12] However, for MS, we don't quite have enough data. We only have enough data for 46 different buckets, which doesn't work very well as a percentile. In this case, we can use NTILE again, and try to create 10 evenly-sized buckets.
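The difference between the two bucket counts can be sketched by computing both side by side and counting how many distinct buckets each school actually fills. As before, the `students` table, its columns, and the rows (400 for GP, 46 for MS) are illustrative assumptions, not the lesson's data.

```python
import sqlite3

# Hypothetical schema and synthetic rows: a large school and a small one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (school TEXT, final_grade INTEGER)")
rows = [("GP", g % 21) for g in range(400)] + [("MS", g % 21) for g in range(46)]
conn.executemany("INSERT INTO students VALUES (?, ?)", rows)

# Compute a 100-bucket and a 10-bucket NTILE over the same window,
# then count how many distinct bucket numbers each school uses.
compare_query = """
SELECT school,
       COUNT(DISTINCT percentile) AS percentile_buckets,
       COUNT(DISTINCT decile)     AS decile_buckets
FROM (
    SELECT school,
           NTILE(100) OVER (PARTITION BY school ORDER BY final_grade) AS percentile,
           NTILE(10)  OVER (PARTITION BY school ORDER BY final_grade) AS decile
    FROM students
)
GROUP BY school
ORDER BY school
"""
buckets = conn.execute(compare_query).fetchall()

# With only 46 rows, NTILE(100) can fill just 46 buckets of one row each,
# while NTILE(10) fills all 10 buckets for both schools.
print(buckets)  # prints: [('GP', 100, 10), ('MS', 46, 10)]
```

When a partition has fewer rows than the requested number of buckets, each row ends up alone in its own bucket, which is why a smaller bucket count is the more useful choice for the smaller school.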

[01:29] Here, each bucket will be much bigger for GP. As we get down to MS, we can see the students being sorted into buckets in much the same way as they were when we used percentiles for the other school.