Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.
In the last movie, we learned about the anchor metacharacters. At the time, you may have noticed a small difference between the caret and the backslash A, and between the dollar sign and the backslash Z. They aren't exactly the same in meaning, because they handle newlines differently. In this movie, we'll learn why, and also learn how to apply these anchors to multi-line strings. First, let's see the problem in action. Here in RegexPal, I have a very simple shopping list that we've seen before, and I came up with a very simple regular expression: lowercase letters a to z and a space inside a character set, repeated one or more times.
Now let's put an anchor at the beginning of this. Let's put the caret, and you can see that it matches, now, only milk. That's it. It didn't match each of those lines; it matched just the beginning of the entire string. Let's try it again, and at the end, let's put a dollar sign. Now it didn't match anything. What's going on there? Why didn't it match sweet potatoes? I have, actually, a line return after sweet potatoes, and if I take that line return away, now it matches. And that really is the secret to what the problem is here. It's because of those line returns.
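The behavior just described can be reproduced outside RegexPal. The following is a minimal sketch in TypeScript, assuming JavaScript-flavored regular expressions with no flags set. The list contents are hypothetical apart from "milk" and "sweet potatoes", and the sketch assumes the caret was removed before the dollar sign was added, which the result described above implies.

```typescript
// Hypothetical shopping list; only "milk" and "sweet potatoes" come from the transcript.
// Note the line return ("\n") after the last item.
const list = "milk\napple juice\nsweet potatoes\n";

// Default mode: the caret matches only at the very start of the string,
// so only the first line is found even with the global flag.
const startAnchored = list.match(/^[a-z ]+/g);
console.log(startAnchored); // [ 'milk' ]

// Default mode: the dollar sign matches only at the very end of the string.
// The last character is "\n", which the character set cannot match.
const endAnchored = list.match(/[a-z ]+$/g);
console.log(endAnchored); // null

// Taking the final line return away lets the last line match.
const trimmedList = list.slice(0, -1);
const endAnchoredTrimmed = trimmedList.match(/[a-z ]+$/g);
console.log(endAnchoredTrimmed); // [ 'sweet potatoes' ]
```

Other regex flavors can treat a trailing line return differently, so the middle result is specific to the JavaScript behavior assumed here.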
When I put the line return character there, it is an actual character, and the regex engine has to decide what to do with it and how to handle it. We ran into the same problem with the dot. Should the dot match everything, including newline characters? We have that S option up here for that. It's a similar problem. There is an actual character, invisible to us, that goes right after each one of those lines, letting it know to put a line return in. So the regex engine is looking at this line for sweet potatoes, and it says: all right, I have O, E, S, and now, is the next thing after my a-to-z-or-space characters the end of the string? It's not.
I have got one more character there. That character is not the end of the string, so the match fails. It's only when we take that character away that it does match. This is called single-line mode, and by default, this is what regular expression engines use. In single-line mode, the caret and the dollar sign do not match at the line breaks. The same thing is true of the backslash A and the backslash Z; they don't match at the line breaks. Many UNIX tools support only single-line mode. That's because in the early days, when these tools were invented, they were invented as single-line tools.
And you'll remember that I told you that the backslash A and the backslash Z are not widely supported, and often can't be used in those UNIX tools either. But over time, as things grew, single-line mode had to change, and we started using multi-line mode. People saw that there was a lot of utility in being able to deal with text as multiple lines. We can put our regular expression engine into multi-line mode. We will see how to do that in a moment. And once we do, suddenly the caret and the dollar sign will start matching at the start and the end of each line instead, just like we might have expected them to do.
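As a rough illustration of the switch to multi-line mode, here is a minimal TypeScript sketch. It assumes JavaScript-flavored regular expressions, where the `m` flag turns multi-line mode on. The list contents are hypothetical apart from "milk" and "sweet potatoes".

```typescript
// Hypothetical shopping list with a line return after every item.
const shoppingList = "milk\napple juice\nsweet potatoes\n";

// Multi-line mode (m flag): the caret and the dollar sign also match
// at the start and end of each line, so every line is found.
const perLine = shoppingList.match(/^[a-z ]+$/gm);
console.log(perLine); // [ 'milk', 'apple juice', 'sweet potatoes' ]

// The same pattern without the m flag anchors to the whole string
// and finds nothing, because the string contains line returns.
const wholeString = shoppingList.match(/^[a-z ]+$/g);
console.log(wholeString); // null
```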
It works the exact same way. So that's it. It's a pretty simple thing to do once you have the concept, but it is an important concept. Otherwise, you may be thinking, well, why in the world is this not matching? It ought to match at the beginning and the end of the line. Well, it's because you're not in multi-line mode. So just make sure that you take care when you're using those anchors to consider whether you're trying to match on those line returns, or whether you're trying to match the entire string all at once.
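To make that choice concrete, here is a small TypeScript sketch, again assuming JavaScript-flavored regular expressions and a hypothetical two-line input. The same pattern answers two different questions depending on the mode: "is the entire string valid?" versus "is at least one line valid?"

```typescript
// Hypothetical input: one line of lowercase letters, one line of digits.
const mixedInput = "milk\n123";

// Default mode: the anchors cover the entire string, so the digits
// (and the line return itself) make the check fail.
const entireStringOk = /^[a-z ]+$/.test(mixedInput);
console.log(entireStringOk); // false

// Multi-line mode: the anchors apply per line, so the check passes
// as soon as any single line ("milk") fits the pattern.
const someLineOk = /^[a-z ]+$/m.test(mixedInput);
console.log(someLineOk); // true
```

If the goal is to validate the entire string at once, leaving multi-line mode off is the safer default in this sketch; if the goal is to find matching lines, multi-line mode is the one to use.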