Use Character Classes in Regular Expressions

Instructor: Joe Maddalone
Published 8 years ago
Updated 5 years ago

Regular Expression Character Classes define a group of characters we can use in conjunction with quantifiers.

[00:00] Character classes in regular expressions allow us to identify specific sets of optional characters that we're willing to accept as part of our matches. We've got this string here, "cat mat bat Hat ?at 0at". Just to get us started, let's match every lowercase a followed by a lowercase t. That works just fine; we know how to do that. Now let's say that we're willing to accept any character before the lowercase a and t. That works just fine too.
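The transcript doesn't show the patterns typed on screen, so here is a minimal sketch of these first two steps. It assumes Python's `re` module rather than whatever tool the video uses; the names `SAMPLE`, `literal`, and `any_before` are made up for illustration.

```python
import re

# The sample string from the lesson.
SAMPLE = "cat mat bat Hat ?at 0at"

# Literal match: a lowercase "a" followed by a lowercase "t".
literal = re.findall(r"at", SAMPLE)

# "." accepts any single character (except a newline) before "at".
any_before = re.findall(r".at", SAMPLE)

print(literal)     # ['at', 'at', 'at', 'at', 'at', 'at']
print(any_before)  # ['cat', 'mat', 'bat', 'Hat', '?at', '0at']
```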

[00:28] But now let's say that we're only willing to accept c or b before the lowercase a and t. We can create a character class. To do that, we place the optional characters we're willing to accept inside square brackets. So I'm going to say [cb] for cat and bat. Save that, and you can see it's only matched cat and bat. One thing to mention here is that the order is not important, so I'm switching it around to b and c, and we get the exact same results.
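As a rough illustration of the bracket syntax, again assuming Python's `re` module and hypothetical variable names, both orderings of the class give the same matches:

```python
import re

SAMPLE = "cat mat bat Hat ?at 0at"

# A character class: accept either "c" or "b" before "at".
cb_matches = re.findall(r"[cb]at", SAMPLE)

# Order inside the brackets does not matter.
bc_matches = re.findall(r"[bc]at", SAMPLE)

print(cb_matches)                # ['cat', 'bat']
print(cb_matches == bc_matches)  # True
```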

[00:59] We can also negate a character class. With the caret at the beginning of the optional characters, what we're saying is we're not willing to accept b or c before the lowercase a and t. You can see it's matched everything except cat and bat. We can also use character class ranges. In our case we're going to do lowercase, so I'm saying I'm willing to accept the character a through the character z, followed by the lowercase a and t. You can see that we've matched cat, mat, and bat, but not the uppercase Hat, or any of the other patterns.
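A small sketch of negation and a lowercase range, under the same assumptions as before (Python's `re` module, illustrative names only):

```python
import re

SAMPLE = "cat mat bat Hat ?at 0at"

# A leading caret negates the class: any character except "b" or "c".
not_bc = re.findall(r"[^bc]at", SAMPLE)

# A range: any lowercase letter from "a" through "z".
lowercase = re.findall(r"[a-z]at", SAMPLE)

print(not_bc)     # ['mat', 'Hat', '?at', '0at']
print(lowercase)  # ['cat', 'mat', 'bat']
```

Note that the caret only negates when it is the first character inside the brackets.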

[01:33] If we wanted to bring in the uppercase letters along with the lowercase letters, we can union these ranges. We don't need any special syntax here; we just add the second range right after the first, so uppercase A to uppercase Z. Save that, and everything's working just fine there. These ranges don't have to cover the full alphabet. If I didn't want to get mat, I could say lowercase a to d, and that's going to work just fine.
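The union of ranges might look like the sketch below. It assumes Python's `re` module, and it assumes the partial range is written as `[a-dA-Z]` with the uppercase range kept, which the transcript does not state outright.

```python
import re

SAMPLE = "cat mat bat Hat ?at 0at"

# Two ranges placed side by side form a union: any ASCII letter.
letters = re.findall(r"[a-zA-Z]at", SAMPLE)

# A partial range: "a" through "d" excludes the "m" in "mat".
partial = re.findall(r"[a-dA-Z]at", SAMPLE)

print(letters)  # ['cat', 'mat', 'bat', 'Hat']
print(partial)  # ['cat', 'bat', 'Hat']
```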

[01:58] We can also negate these. If I want to say I don't want to include any of the characters in these ranges, I can just add that caret right at the beginning of the character class, and now we've only got those last two, ?at and 0at. If, without the caret, we want to also include digits, we can add 0 to 9, and now we've got everything except the one with the question mark at the beginning.
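A sketch of both steps, assuming Python's `re` module and assuming the negated class uses the full `a-z` and `A-Z` ranges (which is what produces only the last two matches):

```python
import re

SAMPLE = "cat mat bat Hat ?at 0at"

# Negated ranges: any character that is not an ASCII letter.
non_letters = re.findall(r"[^a-zA-Z]at", SAMPLE)

# Caret removed and a digit range added to the union.
letters_digits = re.findall(r"[a-zA-Z0-9]at", SAMPLE)

print(non_letters)     # ['?at', '0at']
print(letters_digits)  # ['cat', 'mat', 'bat', 'Hat', '0at']
```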

[02:22] If we want to include that, all we need to do is union that character right into our character class. I will note that even though the question mark is a metacharacter in regular expressions, we don't need to escape it here, so I'll just save that. Now we're also matching the question mark followed by the lowercase a and t.
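To illustrate the point about escaping, here is a minimal sketch assuming Python's `re` module; other regex engines may report a bare leading `?` differently. Inside brackets the question mark is literal, while outside them it is a quantifier and needs a backslash.

```python
import re

SAMPLE = "cat mat bat Hat ?at 0at"

# Inside a character class, "?" is an ordinary character.
with_question = re.findall(r"[a-zA-Z0-9?]at", SAMPLE)
print(with_question)  # ['cat', 'mat', 'bat', 'Hat', '?at', '0at']

# Outside a class, a leading "?" is a quantifier with nothing to repeat,
# so Python rejects the pattern.
try:
    re.compile(r"?at")
    bare_is_valid = True
except re.error:
    bare_is_valid = False
print(bare_is_valid)  # False

# Escaping it outside the class makes it literal again.
escaped = re.findall(r"\?at", SAMPLE)
print(escaped)  # ['?at']
```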
