Find the Start and End of Words with Regular Expression Word Boundaries

Instructor: Joe Maddalone

Published 7 years ago
Updated 4 years ago

Regular expression word boundaries allow us to perform "whole word only" searches within our source string.

[00:00] When we want to capture a whole word in a string, there are a number of different ways to do it. A very handy method is called a word boundary. We are going to create a string that says, "This island is his. It is."

[00:14] What we want to do is capture the word "is." You can see that "i" followed by "s" occurs a whole bunch of times in this string, and we just want to capture the actual word "is." Here's our regular expression, and we'll start off by just capturing "is," and we can see we're going to get that all over the place.
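The transcript does not include the on-screen code, so here is a minimal sketch of that first attempt. Python's `re` module is an assumption made for illustration; the string is the one from the lesson.

```python
import re

text = "This island is his. It is."

# A bare "is" matches the two letters wherever they occur,
# including inside "This", "island", and "his".
starts = [m.start() for m in re.finditer(r"is", text)]
print(starts)  # [2, 5, 12, 16, 23]
```

Five matches, but only the ones at positions 12 and 23 are the standalone word "is."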

[00:37] We're technically capturing what we want, but we're also capturing patterns that we don't want. We could try to get a little tricky with this and say it's "is" followed by a white space. That gets us closer, but it still includes the "is" in "This," and it doesn't include the "is" at the end. If we try to do something like make that white space optional, we're pretty much back where we started.
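The whitespace workaround can be sketched like this, again assuming Python's `re` module. The patterns `is\s` and `is\s?` are one plausible way to write "followed by a white space" and "optionally followed by a white space."

```python
import re

text = "This island is his. It is."

# "is" followed by whitespace: catches the "is" in "This " and the word "is ",
# but misses the final "is." because a period follows it.
with_space = [m.start() for m in re.finditer(r"is\s", text)]

# Making the whitespace optional matches every "is" again.
optional_space = [m.start() for m in re.finditer(r"is\s?", text)]

print(with_space)      # [2, 12]
print(optional_space)  # [2, 5, 12, 16, 23]
```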

[00:59] The way we're going to do this is with a word boundary. The way we identify that is with a "\b." In this case, with the boundary in front, we're looking for words that start with "is." We've got our two is's, but we've also got "island."
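A rough sketch of the leading boundary, written `\b` in regex syntax. Python is assumed, and `words_matching` is a hypothetical helper added here only to show which words contain a match.

```python
import re

text = "This island is his. It is."

def words_matching(pattern, source):
    """Hypothetical helper: list the words in source that contain a match."""
    return [w for w in re.findall(r"\w+", source) if re.search(pattern, w)]

# \b before "is" requires the match to begin at the start of a word.
starts = [m.start() for m in re.finditer(r"\bis", text)]
words = words_matching(r"\bis", text)

print(starts)  # [5, 12, 23]
print(words)   # ['island', 'is', 'is']
```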

[01:19] What we can do is add another boundary at the end. What we're saying is that this word both starts and ends with "i" followed by "s." Now we've got our two is's in our string. We can also negate a word boundary by using "\B." What we're saying is we're looking for a word that contains "i" followed by "s."

[01:41] But "i" followed by "s" is not the beginning of the word. We get "This" and "his." We could flip that around and do it at the end, so \B after "i" followed by "s" means we're looking for a word that contains "i" followed by "s," but "i" followed by "s" is not the end of the word, so we get "island."
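The whole-word pattern and the two negated patterns can be sketched side by side. In regex syntax the boundary is `\b` and its negation is `\B`. Python is assumed, and `words_matching` is a hypothetical helper used only to report which words match.

```python
import re

text = "This island is his. It is."

def words_matching(pattern, source):
    """Hypothetical helper: list the words in source that contain a match."""
    return [w for w in re.findall(r"\w+", source) if re.search(pattern, w)]

# Boundaries on both sides: only the standalone word "is".
whole_word = [m.start() for m in re.finditer(r"\bis\b", text)]

# \B in front: "is" must not begin the word.
not_at_start = words_matching(r"\Bis", text)

# \B behind: "is" must not end the word.
not_at_end = words_matching(r"is\B", text)

print(whole_word)    # [12, 23]
print(not_at_start)  # ['This', 'his']
print(not_at_end)    # ['island']
```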

[02:03] We could also do this with something like "history," which contains "is," but it's at neither the beginning nor the end. If we keep just the negated boundary at the end, that's going to match "history." If we add the negated boundary at the beginning too, we're still going to get "history."
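To see the "history" case, here is a small sketch under two assumptions: Python's `re` module, and a sample string with "history" appended to the earlier sentence (the transcript does not say exactly where the word was added). `words_matching` is again a hypothetical helper.

```python
import re

# Assumed sample string; only the word "history" comes from the lesson.
text = "This island is his history. It is."

def words_matching(pattern, source):
    """Hypothetical helper: list the words in source that contain a match."""
    return [w for w in re.findall(r"\w+", source) if re.search(pattern, w)]

# Negated boundary at the end only: "is" may start the word but not end it.
end_only = words_matching(r"is\B", text)

# Negated boundaries on both sides: "is" must sit strictly inside the word.
both_sides = words_matching(r"\Bis\B", text)

print(end_only)    # ['island', 'history']
print(both_sides)  # ['history']
```

With `\B` on both sides, "island" drops out because its "is" sits at the start of the word, while "history" still matches.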