Find Patterns at the Start and End of Lines with Line Anchors in Regular Expressions

Joe Maddalone
Published 9 years ago
Updated 6 years ago

We can identify the start and end of a line using Line Anchors. When dealing with multiple line matches we can utilize the multiline regular expression flag.

[00:00] When we want to capture the beginning or end of a line we can use something called a line anchor. So for example, here I'm going to set up a string that just represents a date, and in our regex let's say we're just going to try to get the 12. So cool, we've got the 12. Let's go ahead and put another date after this and say that's going to be 12-16-13. Again, we're only going to get the first one, so if we want to get the second one we turn on the global flag, and we've got it. But let's say we only want to get the one at the beginning of the line.

[00:34] We can do that at the very beginning of our regular expression with a caret. Now we've used carets in the past inside of character classes where they represent the negation of whatever characters are inside of that character class. Outside of a character class a caret is used as a line beginning operator. So if I save that you can see that we only get the 12 at the beginning of our line. Now let's go ahead and break this into two lines.
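As a rough sketch of this step, assuming the string holds two dates on one line: the first date, 12-1-16, is inferred from the end of the video, and the variable names are only illustrative.

```javascript
// Hypothetical sample data: two dates on a single line.
const str = '12-1-16 12-16-13';

// Without an anchor, the global flag finds every "12" in the string.
const unanchored = str.match(/12/g);

// Outside a character class, ^ anchors the match to the very start,
// so only the first "12" qualifies.
const anchored = str.match(/^12/g);

console.log(unanchored); // [ '12', '12' ]
console.log(anchored);   // [ '12' ]
```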

[01:01] We've got our line beginning anchor followed by 12, and we've got global on, so the expectation here is probably that we're going to highlight both of the 12s, and that's simply not what happens. To illustrate what's going on here we could call regex.exec(str) and bring that up in the dev tools, and what we'll see is that the input is treated as a single line that just happens to have a line break in it. We could recreate that by manually adding the line break with \n, and we're going to get the exact same result.
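A minimal sketch of what the dev tools would show here, assuming the two dates are 12-1-16 and 12-16-13 joined by \n (the names str and regex are placeholders):

```javascript
// Two "lines" are really one string containing a line break.
const str = '12-1-16\n12-16-13';
const regex = /^12/g;

// The first call finds the "12" at the start of the input.
const first = regex.exec(str);
console.log(first.index);                 // 0
console.log(JSON.stringify(first.input)); // "12-1-16\n12-16-13"

// The second call resumes after that match. Without the m flag,
// ^ only matches at the start of the whole input, so nothing is found.
const second = regex.exec(str);
console.log(second); // null
```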

[01:38] Let's go ahead and put that back on two lines, and the way we're going to get to this is by adding a new flag that is identified by the letter m, and that is the multiline flag. So we save that and we can see that we've actually identified both of the 12s that begin each of our lines. What the multiline flag does is say that a line can't include a line break. It essentially treats whatever follows a line break as the start of a new line. So we've got that in place, let's add a couple more dates here.
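To make the difference concrete, here is a small sketch comparing the same pattern with and without the m flag. The sample dates and variable names are assumptions for illustration.

```javascript
const str = '12-1-16\n12-16-13';

// Without m, ^ matches only at the start of the whole input.
const singleLine = str.match(/^12/g);

// With m, ^ also matches right after each line break.
const multiLine = str.match(/^12/gm);

// The match positions show that the second "12" sits just past the \n.
const starts = [...str.matchAll(/^12/gm)].map((m) => m.index);

console.log(singleLine); // [ '12' ]
console.log(multiLine);  // [ '12', '12' ]
console.log(starts);     // [ 0, 8 ]
```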

[02:13] So 11-12-16, and 12-12-2016, so that should be sufficiently confusing. We're identifying every 12 that begins a line in our data, but let's say we only want to get the ones that occur in December, represented by 12, in the year 2016. So we could say any character, one or more times, followed by 16, and we're pretty close. We've got the two dates we're looking for, but we're also including this guy right here, which is 12 followed by any number of characters and then a 16.
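The over-matching can be reproduced with a short sketch. It assumes the pattern is `^12.+16` and that the four dates are the ones below, with the last written as 12-12-16, the way it is read at the end of the video.

```javascript
// Hypothetical sample data: one date per line.
const str = ['12-1-16', '12-16-13', '11-12-16', '12-12-16'].join('\n');

// Line starts with 12, then one or more of any character, then 16.
const matches = str.match(/^12.+16/gm);

// The second entry is the unwanted partial match from 12-16-13.
console.log(matches); // [ '12-1-16', '12-16', '12-12-16' ]
```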

[02:52] Since we know that we want that character set of 1 followed by 6 at the end of the line, we can use the end line operator, which is represented by $. So now we're saying that the line should begin with 12, be followed by any number of characters, and end with 16. So we save that, we now have the beginning and end line operators that we're looking for, and we've identified 12-1-16 and 12-12-16.
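Putting both anchors together might look like the sketch below. The sample data is the same assumed set of four dates, and the per-line filter at the end is just an alternative way to check the result, not something shown in the video.

```javascript
// Hypothetical sample data: one date per line.
const str = ['12-1-16', '12-16-13', '11-12-16', '12-12-16'].join('\n');

// ^ and $ with the m flag: each line must start with 12 and end with 16.
const matches = str.match(/^12.+16$/gm);
console.log(matches); // [ '12-1-16', '12-12-16' ]

// Alternative: split into lines first, so no m flag is needed.
const filtered = str.split('\n').filter((line) => /^12.+16$/.test(line));
console.log(filtered); // [ '12-1-16', '12-12-16' ]
```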