JavaScript Regular Expressions: Find Sets of Characters

2:41 JavaScript lesson by

Regular Expression Character Classes define a group of characters we can use in conjunction with quantifiers.

Character classes in regular expressions allow us to identify specific sets of optional characters that we're willing to accept as part of our matches. So we've got this string here, "cat mat bat Hat ?at 0at". Just to get us started, let's match all the lowercase a's and t's. So that works just fine. We know how to do that. Let's say that we're willing to accept any character before the lowercase a and t. That works just fine.
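As a rough sketch of these first two steps, assuming `String.prototype.match` with the `g` flag stands in for the live highlighting used in the lesson (the variable names here are illustrative, not from the video):

```javascript
const str = "cat mat bat Hat ?at 0at";

// Literal "at": one match inside each of the six words.
const literal = str.match(/at/g);

// "." accepts any single character (other than a line terminator) before "at".
const anyChar = str.match(/.at/g);

console.log(literal.length); // 6
console.log(anyChar); // ["cat", "mat", "bat", "Hat", "?at", "0at"]
```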

But now let's say that we're only willing to accept c and b before the lowercase a and t. We can create a character class. To do that, we place the optional characters we're willing to accept inside of square brackets. So I'm going to say [cb] for cat and bat. Save that, and you can see it's only matched cat and bat. One thing to mention here is that the order is not important, so I'm switching it around to b and c, and we get the exact same results.
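A minimal sketch of the character class, again assuming `match` with the `g` flag as a stand-in for the on-screen results:

```javascript
const str = "cat mat bat Hat ?at 0at";

// Accept only "c" or "b" before "at".
const cb = str.match(/[cb]at/g);

// The order of characters inside the brackets does not matter.
const bc = str.match(/[bc]at/g);

console.log(cb); // ["cat", "bat"]
console.log(bc); // ["cat", "bat"]
```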

We can also negate a character class. With the caret at the beginning of the optional characters, what we're saying is we're not willing to accept b or c before the lowercase a and t. So you can see it's matched everything except for cat and bat. We can also use character class ranges. In our case we're going to do lowercase, so I'm saying I'm willing to accept the character a through the character z, followed by the lowercase a and t. You can see that we've matched cat, bat, and mat, but not the uppercase Hat or any of the other patterns.
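The negated class and the lowercase range might look like this, assuming the same sample string (variable names are illustrative):

```javascript
const str = "cat mat bat Hat ?at 0at";

// A leading caret negates the class: any character except "b" or "c".
const notBc = str.match(/[^bc]at/g);

// A range: any lowercase letter from "a" through "z".
const lower = str.match(/[a-z]at/g);

console.log(notBc); // ["mat", "Hat", "?at", "0at"]
console.log(lower); // ["cat", "mat", "bat"]
```

Note that a negated class still requires a character in that position; it simply accepts anything outside the listed set.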

So if we wanted to bring in the uppercase letters along with the lowercase letters, we can union these ranges. We don't need any special syntax here, we just add the second range right after the first, so uppercase A to uppercase Z. Save that, and everything's working just fine there. These ranges don't have to be the full alphabet. So if I didn't want to get mat, I could say lowercase a to d, and that's going to work just fine.
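Sketched in code, under the same assumptions as before, the combined and shortened ranges behave like this:

```javascript
const str = "cat mat bat Hat ?at 0at";

// Two ranges written back to back act as a union: any letter, either case.
const anyLetter = str.match(/[a-zA-Z]at/g);

// A range can cover part of the alphabet; "m" falls outside a-d.
const partial = str.match(/[a-dA-Z]at/g);

console.log(anyLetter); // ["cat", "mat", "bat", "Hat"]
console.log(partial); // ["cat", "bat", "Hat"]
```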

We can also negate these. If I want to say I don't want to include any of the characters in these ranges, I can just add that caret right at the beginning of the character ranges, and now we've only got those last two there. If, instead of negating, we want to also include digits, we can add 0 to 9 to the ranges, and now we've got everything except for the one with the question mark at the beginning.
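A sketch of both variations, assuming the ranges are back to the full a-z and A-Z (which is what the "last two" result implies):

```javascript
const str = "cat mat bat Hat ?at 0at";

// Negated ranges: anything that is not a letter before "at".
const notLetter = str.match(/[^a-zA-Z]at/g);

// No caret, plus a digit range: letters or digits before "at".
const letterOrDigit = str.match(/[a-zA-Z0-9]at/g);

console.log(notLetter); // ["?at", "0at"]
console.log(letterOrDigit); // ["cat", "mat", "bat", "Hat", "0at"]
```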

If we want to include that, all we need to do is union that character right into our character class. I will note that even though the question mark is a metacharacter in regular expressions, we don't need to escape it here, so I'll just save that. Now we're also matching the question mark followed by the lowercase a and t.
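A short sketch of the final pattern. The second pattern is not from the lesson; it is added only to contrast how the question mark has to be written outside a character class:

```javascript
const str = "cat mat bat Hat ?at 0at";

// Inside the brackets, "?" is a literal character, not a quantifier.
const withQuestionMark = str.match(/[a-zA-Z0-9?]at/g);

// Outside a character class, the same character has to be escaped.
const escaped = str.match(/\?at/g);

console.log(withQuestionMark); // ["cat", "mat", "bat", "Hat", "?at", "0at"]
console.log(escaped); // ["?at"]
```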
