
Lazy expressions

From: Using Regular Expressions

Video: Lazy expressions

After our discussion in the previous movie about how regular expression engines are greedy by default, you may be wondering: could we make the regular expression engine make a different set of choices? We can, but to do it we need to introduce a metacharacter. It's not a new metacharacter but one we've already seen before, the question mark. Previously we learned that the question mark was the modifier that said the preceding item occurred either zero or one time. Now, in this context, the question mark makes the preceding quantifier into a lazy quantifier.

So what that means is that we stick it right after all those quantifiers that we've seen before. So we have a star and a question mark (*?), a plus and a question mark (+?), our curly braces with a question mark, or question mark question mark (??). The asterisk question mark is actually called the lazy star, and the plus question mark is nicknamed the lazy plus, so you can call them those for short. All of these have the exact same meaning that they did before. The only difference is that instead of adopting a greedy strategy, now they're going to adopt a lazy strategy for making their choices.
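As a rough illustration of the four lazy forms, here is a minimal sketch using Python's re module, which is one of the engines that supports lazy quantifiers. The sample string "aaaa" and the results dictionary are made up for this example and do not come from the course.

```python
import re

# Hypothetical sample text; any run of the same character would do.
text = "aaaa"

# Each pair is (greedy pattern, lazy pattern).
pairs = [
    (r"a*", r"a*?"),          # star vs. lazy star
    (r"a+", r"a+?"),          # plus vs. lazy plus
    (r"a{2,3}", r"a{2,3}?"),  # quantified repetition vs. its lazy form
    (r"a?", r"a??"),          # optional vs. lazy optional
]

results = {}
for greedy, lazy in pairs:
    # re.match anchors the attempt at the start of the string.
    results[greedy] = re.match(greedy, text).group()
    results[lazy] = re.match(lazy, text).group()
    print(f"{greedy!r:12} -> {results[greedy]!r:8} {lazy!r:12} -> {results[lazy]!r}")
```

Each greedy pattern takes as many characters as its quantifier allows, while its lazy counterpart takes the minimum the quantifier allows, which for the lazy star and the lazy optional is nothing at all.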

Remember, the greedy strategy matches as much as possible before giving control over to the next expression part. The lazy strategy does the opposite; it matches as little as possible before giving control to the next expression part. It still defers to the overall match, just like the greedy one does, and it's not necessarily any faster or slower to choose a lazy strategy or a greedy strategy, but it will probably match different things. Now as far as which is faster or slower, it's a little bit like saying, if you've lost your car keys and your sunglasses inside your house, is it better to start looking in the kitchen or to start looking in the living room? You don't know which one's going to yield the best result, and you don't know which one's going to find the sunglasses first or the keys first; it's just about different strategies for starting the search.
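To make "defers to the overall match" a little more concrete, here is a small sketch in Python's re module. The sample string and the three patterns are invented for illustration; they are not from the course.

```python
import re

# Hypothetical sample text.
text = "one, two, three"

# Greedy: .+ grabs everything, then backs off to the LAST comma.
greedy = re.search(r".+,", text).group()

# Lazy: .+? takes one character at a time and stops at the FIRST comma.
lazy = re.search(r".+?,", text).group()

# Lazy, but the rest of the pattern can only match at the end of the text,
# so .+? is forced to keep expanding until the overall match succeeds.
forced = re.search(r".+? three", text).group()

print(repr(greedy))  # 'one, two,'
print(repr(lazy))    # 'one,'
print(repr(forced))  # 'one, two, three'
```

In the last case the lazy quantifier would prefer to stop early, but it keeps taking characters because that is the only way the whole expression can match.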

So you will likely get different results depending on where you start, but it's not necessarily faster to start in one place or the other. So let me just show you this lazy quantifier in action, so you can see how it works in context. These look just like expressions we had before; we've just stuck the question mark after each quantifier. Now, I want you to pay particular attention to the last one, apples??. It's a totally valid regular expression. The first question mark is saying that the s can occur zero or one time.

By default it's greedy, so it's going to prefer one time. If we have the string apples, it's going to want to take a-p-p-l-e-s and be greedy about it, not just stop at the e. The question mark after it, though, tells it to be lazy, which says if you see the word apples, then I want you to just take a-p-p-l-e. So when that's the end of the expression, we've essentially made the s meaningless, because when is there ever a case where you'd be able to find that a-p-p-l-e, fail, and still need to get that s afterwards? There is never an instance.
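The apples example can be checked directly. This is a minimal sketch in Python's re module, assuming the input is simply the word "apples"; the variable names are made up for the example.

```python
import re

text = "apples"

# Greedy optional s: prefers to take the s when it is there.
greedy = re.search(r"apples?", text).group()

# Lazy optional s: prefers to skip the s, so the match stops at the e.
lazy = re.search(r"apples??", text).group()

# With nothing after it in the pattern, the lazy version behaves
# the same as leaving the s out altogether.
plain = re.search(r"apple", text).group()

print(greedy)  # apples
print(lazy)    # apple
print(plain)   # apple
```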

It's really meaningless to have it, and some tutorials leave it out altogether. I just want to show it to you so that you've seen it, but it can be completely omitted and ignored. Now as far as support goes, most regular expression engines support this lazy quantifier, but UNIX tools are always greedy. The tools that use BREs and EREs do not support it. It was really something that came along with Perl and Perl-compatible regular expressions. So most programming languages and regular expression engines that are out there will allow you to adopt a lazy strategy instead of a greedy one.

So let's look at how the regular expression engine parses it, just like we did before. We have the same string, Page 266, and the same regular expression, but I have put the question mark after the star. That makes it a lazy star. So it's wildcard lazy star, then any digit one or more times. So what happens is the regular expression engine starts out at the beginning, and actually, before it even goes to the first letter, it says, oh, you know what, this wildcard is zero or more times. It could be zero. Boy, that would be really lazy. I could get away with doing no work at all. Let me first just try doing nothing and see if the next part of the expression can take over and do all the work for me.

So the next part of the expression then goes to the P and says nope, that's not a digit. So then the wildcard says okay, okay, I'll try and match it. Oh, it does match for me. All right, so I'll take the P. All right, I am done now. I am going to take a break. Next part of the regular expression, you do your job. So it goes to the a and it says, is that a digit? No, it's not. So the wildcard says, okay, I'll get up and do a little bit more work. I'll take the P and the a; those both match me, the wildcard. Now it asks once again, hey digit, can you take over? The digits, or whatever the next part of the expression happens to be, takes over and tries again.

And it keeps doing that as it goes along the string, and finally, when it gets to the 2, the digits start matching. The digit says, oops! I found a 2. That works for me. So now the wildcard is equal to Page space, and the digits say ah! This works for me, this matches me, the six matches me, and the next six matches me; therefore I now have a match. So in this case it still matched exactly the same thing, but it did it by using a different strategy. One note of caution I want to give you is that if you were to take this same expression but you changed the plus sign to be a lazy star as well, what do you think it would match in this case? Both of them are being lazy.

Take a second and think about it. What happens is the first one says, oh I can get away with matching nothing, so I match zero. Then it goes to the next one and says, hey, why don't you take over? The 0-9 says oh, I can get away with matching nothing too, so then it says, go along to the next regular expression, and both of them have succeeded in fulfilling their requirements. So it says I made a match, but you know what your matched result was? Nothing. It matched absolutely nothing. So you have to be careful. If everything in the expression is optional then it's going to match nothing at all.
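The walk-through and the note of caution can both be reproduced with a short sketch in Python's re module. The capture groups are an addition for this example, so that the share taken by the wildcard and the share taken by the digits can be seen separately; the spoken expression is assumed to be wildcard star (or lazy star) followed by [0-9] plus.

```python
import re

text = "Page 266"

# Greedy star: the wildcard takes as much as it can and leaves one digit.
greedy = re.search(r"(.*)([0-9]+)", text)

# Lazy star: the wildcard takes as little as it can and leaves all the digits.
lazy = re.search(r"(.*?)([0-9]+)", text)

# Both quantifiers lazy and both allowed to match zero times:
# the engine succeeds immediately with an empty match at position 0.
both_lazy = re.search(r".*?[0-9]*?", text)

print(greedy.group(0), greedy.groups())  # Page 266 ('Page 26', '6')
print(lazy.group(0), lazy.groups())      # Page 266 ('Page ', '266')
print(repr(both_lazy.group(0)), both_lazy.span())  # '' (0, 0)
```

The first two searches return the same overall match and differ only in how it is divided between the two parts, while the third reports a successful match that contains nothing.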

And just to round this out, let's take one last look at those same samples that we were looking at in the last movie, where we had an Excel file name and a comma-delimited line of names and companies. And let's look at what would be matched if you used lazy expressions instead of greedy expressions. So notice that I have now got a lazy plus after the word character in the first example and a lazy plus after each of those wildcards in the second example. So what does it match now? Remember, before, the first one matched the entire thing except for the .xls.

Well, now it matches just the first part, because that word character tries to give up as quickly as it can and see if it can get a match. Then in the second example, instead of having the entire string be matched, now it says, all right, I am going to try the first one, but I am going to give up as quickly as I can and see if the next one can do its thing. So again, two different strategies. One is not necessarily better than the other, but they do often yield different results, so it's something that you definitely need to be mindful of as you are constructing your regular expressions.
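The on-screen samples are not reproduced in the transcript, so the following sketch uses stand-ins. The file name, the comma-delimited line, and the patterns are all assumptions chosen to behave the way the narration describes, using Python's re module.

```python
import re

# Hypothetical stand-ins for the on-screen samples.
filename = "01_FY_07_report_99.xls"
csv_line = '"Milton", "Waddams", "Initech, Inc."'

# Greedy \w+ runs to the last digits before the dot;
# lazy \w+? gives up at the first digits it can hand over.
file_greedy = re.search(r"\d+\w+\d+", filename).group()
file_lazy = re.search(r"\d+\w+?\d+", filename).group()

# Greedy wildcards stretch across the whole line;
# lazy wildcards stop at the first closing quote that works.
csv_greedy = re.search(r'".+", ".+"', csv_line).group()
csv_lazy = re.search(r'".+?", ".+?"', csv_line).group()

print(file_greedy)  # 01_FY_07_report_99
print(file_lazy)    # 01_FY_07
print(csv_greedy)   # "Milton", "Waddams", "Initech, Inc."
print(csv_lazy)     # "Milton", "Waddams"
```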


This video is part of

Using Regular Expressions

59 video lessons · 11676 viewers

Kevin Skoglund
Author

 
  1. 2m 18s
    1. Welcome
      56s
    2. Using the exercise files
      1m 22s
  2. 19m 55s
    1. What are regular expressions?
      3m 20s
    2. The history of regular expressions
      6m 40s
    3. Regular expression engines
      2m 44s
    4. Installing an engine
      4m 5s
    5. Notation conventions and modes
      3m 6s
  3. 21m 23s
    1. Literal characters
      6m 39s
    2. Metacharacters
      2m 1s
    3. The wildcard metacharacter
      4m 31s
    4. Escaping metacharacters
      4m 53s
    5. Other special characters
      3m 19s
  4. 31m 26s
    1. Defining a character set
      5m 49s
    2. Character ranges
      4m 49s
    3. Negative character sets
      4m 53s
    4. Metacharacters inside character sets
      5m 12s
    5. Shorthand character sets
      6m 30s
    6. POSIX bracket expressions
      4m 13s
  5. 36m 38s
    1. Repetition metacharacters
      7m 17s
    2. Quantified repetition
      6m 59s
    3. Greedy expressions
      6m 27s
    4. Lazy expressions
      6m 46s
    5. Using repetition efficiently
      9m 9s
  6. 20m 24s
    1. Grouping metacharacters
      4m 14s
    2. Alternation metacharacter
      4m 54s
    3. Writing logical and efficient alternations
      7m 33s
    4. Repeating and nesting alternations
      3m 43s
  7. 19m 19s
    1. Start and end anchors
      7m 21s
    2. Line breaks and Multiline mode
      4m 41s
    3. Word boundaries
      7m 17s
  8. 23m 33s
    1. Backreferences
      8m 57s
    2. Backreferences to optional expressions
      3m 51s
    3. Finding and replacing using backreferences
      7m 16s
    4. Non-capturing group expressions
      3m 29s
  9. 32m 31s
    1. Positive lookahead assertions
      6m 39s
    2. Double-testing with lookahead assertions
      7m 16s
    3. Negative lookahead assertions
      6m 10s
    4. Lookbehind assertions
      6m 26s
    5. The power of positions
      6m 0s
  10. 13m 13s
    1. About Unicode
      4m 19s
    2. Unicode in regular expressions
      4m 41s
    3. Unicode wildcards and properties
      4m 13s
  11. 1h 55m
    1. How to use this chapter
      5m 38s
    2. Matching names
      6m 33s
    3. Matching postal codes
      8m 54s
    4. Matching email addresses
      5m 0s
    5. Matching URLs
      8m 1s
    6. Matching decimal numbers and currency
      6m 45s
    7. Matching IP addresses
      7m 10s
    8. Matching dates
      7m 49s
    9. Matching times
      8m 59s
    10. Matching HTML tags
      8m 34s
    11. Matching passwords
      6m 49s
    12. Matching credit card numbers
      9m 36s
    13. Finding words near other words
      6m 38s
    14. Formatting with Search and Replace, pt. 1
      7m 22s
    15. Formatting with Search and Replace, pt. 2
      4m 15s
    16. Formatting with Search and Replace, pt. 3
      7m 10s
  12. 47s
    1. Goodbye
      47s
