Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.
In this movie, we're going to take a look at the grouping metacharacters. Aside from literal characters, the grouping metacharacters are probably the easiest ones there are. They're just simple parentheses. That's all there is to it. We put parentheses around things that we want to group together. It's a very commonsense approach. Now, why would we want to do this? Well, there are a few reasons. One is that putting these metacharacters around a couple of different characters or a couple of different character sets allows us to apply repetition to that group. Remember, when we were doing repetition before, we were repeating just a single element: either a single character or a character set.
Now, we can group several of those together and repeat the group as a whole. That's simple, but it's actually really powerful. The second reason you might use them is that they can make your expressions easier to read in some cases. And last of all, they capture the group for use in matching and replacing. What that means, essentially, is that the regular expression engine remembers the group for use later on. Now, that's a more advanced thing that we're going to come back to when we get to backreferences. But for now, just keep in mind that the parentheses mark this group for the regular expression engine so that it can make use of it.
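As a small preview of what "remembers the group" means, here is a minimal sketch using Python's re module. Python is my choice for illustration only, and the sample word is borrowed from the examples in this movie. It only shows that the text matched inside the parentheses can be read back afterward; backreferences themselves come later.

```python
import re

# The parentheses group "in" and also capture whatever that group matched.
m = re.search(r"(in)dependent", "independent")

whole = m.group(0)     # the entire match
captured = m.group(1)  # the text matched by the first parenthesized group

print(whole)     # independent
print(captured)  # in
```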
Now, one important point about these metacharacters is that they cannot be used for grouping inside a character set. There is no reason to group anything inside a character set. The purpose of a character set is to define a set of characters. We can put a group around a character set, or around several character sets, but not inside one; inside a character set, parentheses have their literal meaning. So let's take a look at some examples. Let's just say we have the simple characters a, b, and c. Well, if we just put the plus at the end, abc+, that would repeat only the c. It would match abccccc. But if we want to match abcabcabc, we group them together and then apply our repetition operator: (abc)+.
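Here is a minimal sketch of those two points, assuming Python's re module as the engine (the course tool may behave slightly differently). The variable names are just for illustration.

```python
import re

# Without a group, + applies only to the c.
plus_on_c = re.fullmatch(r"abc+", "abccccc") is not None        # True
plus_on_c_abc = re.fullmatch(r"abc+", "abcabcabc") is not None  # False

# With a group, + repeats abc as a whole.
plus_on_group = re.fullmatch(r"(abc)+", "abcabcabc") is not None  # True

# Inside a character set, parentheses are literal characters.
literal_parens = re.findall(r"[(a)]", "(a) b")

print(plus_on_c, plus_on_c_abc, plus_on_group)
print(literal_parens)  # ['(', 'a', ')']
```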
See how that works? We can also do the same thing with optional characters, using the question mark. We can match both dependent and independent by putting the "in" inside a group and then applying the question mark after it to mark it as optional: (in)?dependent. That should match both independent and dependent. I was saying that groups can make your expressions easier to read. It's up to you, but some people may find it clearer to write run(s)?, with the S optional and inside a group, than to just write runs? without one. That's really a matter of personal taste.
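A minimal sketch of the optional group, again assuming Python's re module. The test sentence is the one typed in later in this movie, and the variable names are illustrative.

```python
import re

# (in)? makes the two letters optional together.
words = ["dependent", "independent"]
matched = [w for w in words if re.fullmatch(r"(in)?dependent", w)]

# runs? and run(s)? match the same text; the group only changes readability
# (and, as a side effect, captures the s when it is present).
sentence = "I run fast. He runs faster."
plain = [m.group(0) for m in re.finditer(r"runs?", sentence)]
grouped = [m.group(0) for m in re.finditer(r"run(s)?", sentence)]

print(matched)  # ['dependent', 'independent']
print(plain)    # ['run', 'runs']
print(grouped)  # ['run', 'runs']
```

finditer is used here rather than findall because, in Python, findall returns the captured group's text instead of the whole match when the pattern contains a group.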
Let's try these out. Okay. To start with, I'm just going to paste in a string of text here which has A1B2C3D4E5 and so on; you see the pattern. For our regular expression, let's put capital letters A to Z inside a set, and then 0 to 9 inside a set: [A-Z][0-9]. So that's a capital letter followed by a digit. You can see what it matched up there for me. Now let's put parentheses around it. You can see that did nothing to it. It did not change its meaning at all. All it did was tell the regular expression engine, hey! Remember this for later. I might use it.
But we haven't gotten to that. We haven't talked about that behavior. For now, in terms of matching, it didn't match anything different. If we now put a plus after the group, you can see that it matches the entire string, because it says ah! This can be repeated. You can see it even better if we tell it to match exactly 3 times. So find a letter and a number, 3 times, and then it starts over. The first match is in yellow, the next one is in blue, the next one is in yellow: sets of three letter-and-digit pairs. Let's try the other example we had. Let's have dependent or independent.
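The same experiment can be sketched in Python's re module. The exact string pasted in the movie isn't shown in full, so the sample text below is an assumption: a short string that follows the same letter-digit pattern.

```python
import re

# Assumed sample text; the string used in the movie continues past E5.
text = "A1B2C3D4E5F6"

# One capital letter followed by one digit: each pair is its own match.
pairs = re.findall(r"[A-Z][0-9]", text)

# Grouping the pair and adding + lets a single match span the whole string.
repeated = re.search(r"([A-Z][0-9])+", text).group(0)

# {3} repeats the group exactly three times, then matching starts over.
triples = [m.group(0) for m in re.finditer(r"([A-Z][0-9]){3}", text)]

print(pairs)     # ['A1', 'B2', 'C3', 'D4', 'E5', 'F6']
print(repeated)  # A1B2C3D4E5F6
print(triples)   # ['A1B2C3', 'D4E5F6']
```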
And for our regular expression, let's put in (in)dependent. The parentheses here don't make a difference. They don't actually change what it matches. But when I put the question mark after them, now both the I and the N are optional together. That's very different from what happens if I take the parentheses away: then only the N is optional, and it no longer matches dependent on its own. With the parentheses back, the whole (in) is optional. So I do need those parentheses there so that the group is optional. And as I said, with runs, it's really just a matter of taste. I run fast. He runs faster. Let's change the expression and just make it runs? without a group. It matches both of them, and so does run(s)? with the parentheses and the question mark after them.
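To see why the parentheses matter here, this minimal Python sketch compares the two patterns. It assumes the re module, and the variable names are illustrative. Without the group, the question mark applies only to the n, so the i is still required.

```python
import re

ungrouped = re.compile(r"in?dependent")   # only the n is optional
grouped = re.compile(r"(in)?dependent")   # the whole "in" is optional

ungrouped_dependent = ungrouped.fullmatch("dependent") is not None      # False
ungrouped_independent = ungrouped.fullmatch("independent") is not None  # True
ungrouped_idependent = ungrouped.fullmatch("idependent") is not None    # True

grouped_dependent = grouped.fullmatch("dependent") is not None          # True
grouped_independent = grouped.fullmatch("independent") is not None      # True

print(ungrouped_dependent, ungrouped_independent, ungrouped_idependent)
print(grouped_dependent, grouped_independent)
```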
So it's really just up to you which one you find clearer; they both do the exact same thing. Now, as I said, we're going to come back and talk more about capturing groups a little later on when we talk about backreferences, but groups are also going to be helpful for working with alternation, and that's what we're going to look at in the next movie.