Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.
Now that we've learned about capturing groups and backreferences to those groups, I want us to talk about non-capturing group expressions. Remember, I told you that whenever we put parentheses around an expression, it gets captured automatically by default. So what we are going to do now is tell the regex engine to turn off that default behavior. And we are going to do that with two new metacharacters together: the question mark and the colon. Now, notice this question mark is in a different context than when we've seen it before.
We saw it as the optional character, and we saw it as the non-greedy character. Here we are seeing a third use for it. Question mark, colon specifies that this should be a non-capturing group. In terms of syntax, a group like a word character repeated inside parentheses, (\w+), becomes question mark, colon, and then the word character repeated, all inside the parentheses: (?:\w+). So it's important that the first two characters after the beginning of the group are the question mark and the colon.
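The syntax just described can be tried out in a few lines. This is only a minimal sketch, and it assumes Python's built-in `re` module rather than the RegexPal tool used in the video; the variable names are made up for illustration.

```python
import re

# The same expression, once as a capturing group and once as a non-capturing group.
capturing = re.compile(r"(\w+)")
non_capturing = re.compile(r"(?:\w+)")

# .groups reports how many capturing groups a compiled pattern contains.
capturing_count = capturing.groups
non_capturing_count = non_capturing.groups

word = "apples"
captured = capturing.match(word).groups()        # text saved by each capturing group
uncaptured = non_capturing.match(word).groups()  # nothing is saved here
whole_match = non_capturing.match(word).group(0)  # the overall match is still available

print(capturing_count, captured)
print(non_capturing_count, uncaptured, whole_match)
```

Both patterns match the same text; the only difference is whether the group's contents are saved for later use.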
That tells the engine it's a non-capturing group, which turns off the capture, so no backreference is created for it. Now, why would you want to do this? Well, for one thing, you can optimize your regex for speed, because the engine doesn't have to save the matched text. The second reason is that it can preserve space for more captures. Some regex engines allow us to have up to 99 captures, but a lot of them limit us to nine. So if we just have those nine, then we want to be very efficient about how we use them, and make sure we don't accidentally waste one on a group expression that we don't need. Now, as far as support for these goes, most regex engines will support them, except for the older UNIX tools.
Essentially, this was an invention that came along with Perl and Perl-compatible regular expressions. Everything from Perl forward does have support for these. So it's a pretty simple concept. It's just the syntax that takes a little bit of remembering. Let's try a couple of examples, though, to make sure. So for starters, let's say in RegexPal we have the text "oranges and apples to oranges", and then for our regular expression, let's just copy that and paste it up here, and then let's put parentheses around oranges and around apples.
So now it's going to capture both of those. If I put a backreference to number one here at the end, you see we get a match, right? Because that's oranges. Backreference 2 is apples, and that doesn't match. Let's go ahead and copy this line of text, and make the second copy end in apples; there we are. So you see backreference 2 matches that line, and backreference 1 matches the first line. If we make the first group a non-capturing group, what does backreference 1 match now? The second line, because oranges doesn't get captured, so register one, the spot held by number one, now holds the first captured group, which is apples.
If we use backreference 2, of course it matches nothing, because nothing has been saved in that spot. Now, it's important to note that in this syntax, the question mark, colon, the question mark really means: give this group a different meaning. And then the colon after it says that the meaning is going to be non-capturing. The reason I want to point out that you can look at it in those two parts is that in the next chapter we are going to be looking at other things that share the same syntax.
They use the question mark to say, give this group a different meaning, followed immediately by another modifier, which says what that meaning is going to be. So if you think of it in those two parts, the next chapter on lookaround assertions is going to be easier to follow.
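The oranges-and-apples demonstration can be reproduced outside RegexPal. The following is a sketch under assumptions: it uses Python's `re` module instead of the tool shown in the video, and the names `line_oranges`, `line_apples`, and `matches` are hypothetical helpers for illustration only.

```python
import re

# The two lines of sample text from the demonstration.
line_oranges = "oranges and apples to oranges"
line_apples = "oranges and apples to apples"

# Both groups captured: \1 refers to "oranges".
both_captured = re.compile(r"(oranges) and (apples) to \1")
# First group non-capturing: \1 now refers to "apples".
first_uncaptured = re.compile(r"(?:oranges) and (apples) to \1")

def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None

both_on_oranges = matches(both_captured, line_oranges)            # True
both_on_apples = matches(both_captured, line_apples)              # False
uncaptured_on_oranges = matches(first_uncaptured, line_oranges)   # False
uncaptured_on_apples = matches(first_uncaptured, line_apples)     # True

# With only one capturing group left, \2 has nothing to refer to.
# Python rejects the pattern at compile time instead of silently matching nothing.
try:
    re.compile(r"(?:oranges) and (apples) to \2")
    group_two_error = None
except re.error as exc:
    group_two_error = str(exc)

print(both_on_oranges, both_on_apples, uncaptured_on_oranges, uncaptured_on_apples)
print(group_two_error)
```

Note the one difference from the video: where the RegexPal demo simply finds no match for backreference 2, Python's `re` module raises an error for a reference to a group that doesn't exist.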