
Repetition metacharacters

From: Using Regular Expressions

In this chapter, we will learn to use repetition metacharacters to gain more matching power in our regular expressions. We're going to start by looking at the three main metacharacters: the star (or asterisk), the plus, and the question mark. Each one of these metacharacters has an effect on the item that immediately precedes it. That item could be a literal character, it could be a shorthand character set, it could be a more complicated expression that we haven't even learned yet, but the metacharacter takes that preceding item and determines how many times it can be repeated. In the case of the asterisk, that item can be there zero or more times; in the case of the plus, it would be there one or more times; and with the question mark, the item would just be there zero times or one time.

Now, it may be weird to think about repetition in terms of something being there zero times, but it does make sense. What we're really doing is quantifying whether this item is repeatable or not repeatable. Now, take a second and notice that out of these three metacharacters, only the plus sign says the item actually must exist. Both the asterisk and the question mark allow for the possibility that the item doesn't exist at all. Let's take a look at some concrete examples. Let's say I have a regular expression with the literal characters A-P-P-L-E-S followed by the asterisk: apples*. First, let's note that the asterisk only applies to the S, which is the immediately preceding item.

Later we'll learn ways to repeat the whole word apples, if we wanted to. But for now, the asterisk just applies to the literal S. So it would match the strings apple, apples, and applesssssss with a whole bunch of Ss after it. Notice that it matches apple without the S because the S is optional. It's zero or more times, so zero is fine. One time is fine, two times, three times, four times; it doesn't have a limit on it. It says the item may or may not be there, and if it's there, it could be repeated. Now, with a simple word like apples, maybe you're thinking, well, that's kind of silly. Why would I ever want to look for apples with a whole bunch of Ss after it? But imagine that you had a file where some of the words had tabs separating them and you wanted to search for things that might have a tab between words.

It might not have a tab between words; it might have five tabs between words; you don't care about that in your matching; you still want it to match those words regardless of how many tabs are in between them. That's when you would use this. Let's compare that to the plus metacharacter. With the plus, apples+ matches apples and applesssssss with all the Ss, but not the single apple, because the plus says that the S must exist. It must be there, but if it's there, it can be repeated. And then last of all, we have the question mark, which says that the item can be there or not there (it's the optional metacharacter) but never repeated.
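The tab scenario described above can be sketched in a few lines. This is only an illustration, assuming Python's built-in re module (the course does not prescribe a language); the words red and apples and the sample strings are made up for the example.

```python
import re

# Hypothetical pattern: the word "red", then zero or more tabs, then "apples".
# The asterisk applies only to the \t that immediately precedes it.
tabs_between = re.compile(r"red\t*apples")

samples = [
    "redapples",              # no tab at all
    "red\tapples",            # one tab
    "red\t\t\t\t\tapples",    # five tabs
    "red apples",             # a space, which is not a tab
]

# fullmatch() requires the whole string to fit the pattern.
results = {s: bool(tabs_between.fullmatch(s)) for s in samples}

for sample, matched in results.items():
    print(repr(sample), "->", matched)
```

The first three strings match regardless of how many tabs they contain; the last one does not, because the pattern allows only tabs between the two words.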

It's not repeatable; there can't be two of them; there can't be three of them, only zero or one. Now, as I said, we don't just have to put a literal character there; we could use a character class or character set. I'm going to use the shorthand character set \d and show you how you could match numbers with three digits or more using the star: \d\d\d\d*. Don't be tripped up here. Notice that we're talking about three digits or more, even though you see four of those \ds. Remember, that last one is optional. It's exactly the same as if we wrote it this way, \d\d\d+, three of those \ds followed by the plus sign. That's three digits or more.

Now typically, I would write it the second way, because I think that's clearer and you're less likely to make mistakes. But I want you to see that they are the same thing. Also, don't be fooled into thinking that this means that the actual digit at the end has to be repeated. It's not that the number has to be 1233333333, with the 3 repeating, or it won't match; it could also be 123456789. It's the expression, or in this case the digit character set, that gets repeated. Then another classic use of the question mark metacharacter is to say that a letter is optional.
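To check that the two spellings really behave the same, here is a small sketch, again assuming Python's re module; the list of sample numbers is chosen for illustration.

```python
import re

star_form = re.compile(r"\d\d\d\d*")  # three digits, then zero or more digits
plus_form = re.compile(r"\d\d\d+")    # two digits, then one or more digits

samples = ["12", "123", "1234", "1233333333", "123456789"]

# fullmatch() asks whether the entire string is a run of three or more digits.
star_results = [bool(star_form.fullmatch(s)) for s in samples]
plus_results = [bool(plus_form.fullmatch(s)) for s in samples]

for s, a, b in zip(samples, star_results, plus_results):
    print(s, a, b)
```

Both patterns reject 12 and accept everything else, including 123456789, where no single digit repeats: it is the \d set that is repeated, not a particular digit.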

So for example, we match color either with or without the U by writing colou?r. This u, with the question mark after it, is optional. It may or may not be there; the expression matches in both cases. Now, as far as support for these, they're supported in most regular expression engines. The one exception is that really old UNIX programs, such as the original grep, do not support plus and question mark, only the asterisk. So just keep that in mind. If you're working with really old UNIX programs, you may not have built-in support for the plus or the question mark, or you may need to use an extra option, like grep has, where you use the -E option to be able to use extended regular expressions.
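The optional-letter idea can be sketched like this, assuming Python's re module; the word list is illustrative.

```python
import re

# The question mark makes only the preceding "u" optional: zero or one of them.
optional_u = re.compile(r"colou?r")

words = ["color", "colour", "colouur"]

# Keep only the words that match the pattern in full.
matches = [w for w in words if optional_u.fullmatch(w)]
print(matches)
```

Only color and colour are kept; colouur has two u's, and the question mark never allows repetition.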

Let's try some out. All right! Let's start with our very simple example here. Let's try apple, apples, and applesssssss. One, two, three, four, five, six, seven. There we go! Now, for our regular expression, we're going to just put in apple. You see of course that just matches apple by itself. If I put in the S at the end, it just matches the apples. If we put the asterisk at the end, it matches all three of them. If I put in the plus, it matches just those last two, because now the S is required or it won't match. And then the last possibility is that we put the question mark in. In this case, it matches the first two, and in the third string it matches just the apples part, but not the whole thing.
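The demonstration above can be approximated in code. This is a sketch assuming Python's re module rather than the tool used in the video; findall() stands in for the highlighting, returning each piece of text that a pattern matches.

```python
import re

text = "apple apples applesssssss"

# The same literal word with each of the three repetition metacharacters.
patterns = ["apples*", "apples+", "apples?"]

# findall() returns every non-overlapping match found in the text.
found = {p: re.findall(p, text) for p in patterns}

for pattern, hits in found.items():
    print(pattern, "->", hits)
```

With the question mark, the third result is just apples: the pattern matches part of applesssssss and leaves the remaining s characters unmatched, which is the partial match seen in the demonstration.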

So it does make a partial match, but not a full match. Let's try our digit example. Let's say that we have \d\d\d, and let's use the asterisked version first, \d\d\d\d*, just so we can see that. For test strings, 123456789, and then let's do 1234 and 123 and 12, just so we can see. So notice, it does match three digits or more, all right? It does match this one. Even though it's got four \ds up here, it does match these three digits here. Notice that none of these digits are repeated; it's the digit class that's being repeated each time.

If I put a plus here, now it's four digits or more. I take out one \d, and now it matches three digits or more, the exact same way. Let's try another example. This is one that wasn't in the slide, so take everything out of this. And let's put in [a-z]+\d[a-z]* and let's try and match abc9xyz, okay? So that matches; it makes sense. It's got characters that are repeated at the front, a digit, and then characters that are repeated at the end.
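A minimal sketch of this pattern, assuming Python's re module, tried against the sample string and a few shortened variants of it (the variants are chosen for illustration):

```python
import re

# One or more lowercase letters, then one digit, then zero or more lowercase letters.
pattern = re.compile(r"[a-z]+\d[a-z]*")

samples = ["abc9xyz", "c9xyz", "9xyz", "abc9x", "abc9"]

# fullmatch() requires the entire string to fit the pattern.
results = {s: bool(pattern.fullmatch(s)) for s in samples}

for sample, matched in results.items():
    print(sample, "->", matched)
```

Only 9xyz fails, because the plus requires at least one letter before the digit. The letters after the digit can be dropped entirely, since the asterisk allows zero of them.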

What if we take away some of those characters? Let's say that we take away these first two. It still matches, right, because the plus says that the first character is there one or more times. If I take it away, though, now we no longer have a match. It's required to have at least one character before the digit. Let's try the other way. If we go back here and take away two characters at the end, no big deal, because the asterisk lets them be there zero or more times. If we take the Z completely away, we still have a match, because we told it that it's possible it doesn't exist at all; it is still optional. Then last of all, let's just look at that color example, colou?r, and then let's do colour or color. Both of them match.

Of course colouur with two Us doesn't match. All right! Let's try another one. Let's put a simple phrase in here, and we say, "We picked apples." And then up here for a regular expression, what we're going to do is look for any word character followed by a plus sign and then a literal S after it: \w+s. Essentially, what we're saying is find words that end in S. It's a very simple way to just say find me all words that end in S. So in "We picked apples." it finds apples, and says, ah! That word ends in S. So you can see how repetition characters are very useful. And as we saw before, they do allow for unlimited numbers of items in there.
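The last example can be sketched as follows, assuming Python's re module; the second test word, misty, is an addition for illustration and is not from the video.

```python
import re

# One or more word characters followed by a literal "s".
word_then_s = re.compile(r"\w+s")

hits = word_then_s.findall("We picked apples.")
print(hits)

# The pattern has nothing that ties the "s" to the end of a word,
# so it can also stop at an "s" in the middle of one.
partial = word_then_s.findall("misty")
print(partial)
```

For the phrase from the video this finds only apples. As the second call suggests, it is a simple approximation of "words that end in S" rather than a strict test.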

That still matches just as well. If we want to limit the number that it matches and put an upper limit on it and cap it, well then to do that, we need to use quantified repetition, and that's what we'll look at in the next movie.

This video is part of

Using Regular Expressions

59 video lessons · 11655 viewers

Kevin Skoglund
Author

 
