
Positive lookahead assertions

From: Using Regular Expressions

In this chapter, we will be talking about lookaround assertions. Lookaround assertions are made up of two main types: lookahead assertions, and lookbehind assertions, and those are further divided into positive, and negative. In this movie, we'll start by examining positive lookahead assertions. The metacharacters that we are going to use to define a positive lookahead assertion are the question mark, followed by the equals sign, and you will use these inside a grouped expression as the first characters inside the parentheses. A lookahead assertion is an assertion of what ought to lie ahead.

We're telling the regular expression engine, essentially, take this grouped expression, and look ahead, and see if you can find a match. If our lookahead expression fails, well then the whole match will fail. But if not, then the engine will keep going, and see if it can make a match out of everything else in the expression, and we can use any valid regular expression inside the lookahead assertion. The most important point about these assertions, though, is that they are zero-width. Remember we saw that anchors and word boundaries are zero-width matches as well? They refer to a position, rather than to an actual character.

The same is true with assertions. The assertion will return true or false about whether it makes a match, but it does not actually match any characters. That's why they're called assertions. They just assert that something ought to be true about the match, but without doing anything else. Lookahead assertions are going to be supported by most modern regular expression engines. Perl first introduced them, and since then, Perl-compatible regular expression engines typically support them. Engines developed prior to Perl, like the UNIX tools developed during the 1970s though, do not.
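To make the zero-width idea concrete, here is a minimal sketch using Python's re module (the choice of Python is an assumption; the movie uses a different tool). A lookahead on its own reports a position and consumes no characters.

```python
import re

# A lookahead by itself matches a position, not characters.
m = re.search(r"(?=shore)", "seashore")

zero_width_span = m.span()   # start and end index are identical
matched_text = m.group()     # nothing was consumed, so this is empty

print(zero_width_span, repr(matched_text))
```

The span starts and ends at the same index, just before "shore", which is what zero-width means in practice.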

So as I said at the start, we define a lookahead assertion by using a question mark, and an equals, inside a grouped expression. That's very similar to what we used for a non-capturing group, but with an equal sign instead of a colon. And just like the non-capturing group, the purpose of the question mark is to indicate that the group has a different meaning than normal. Then it's the second character that comes after the question mark that defines what kind of special meaning it will have. The equal sign is what defines that it will be a lookahead assertion. The equals sign is easy to remember, and makes sense as being positive, because what we are saying is that our expression should be equal to something.

Be careful not to put a space after the equals sign, though. It may seem more readable, but that space has meaning, and becomes part of the regular expression. It should just be question mark, equals sign, and then immediately the regular expression you want to assert. Let's look at some examples. So let's say that we have an assertion that we should have a match for seashore -- that's inside our group -- and then after that, we have the literal characters S, E, A; written out, that is (?=seashore)sea. So if we have the string seashore, the first thing that happens is the regular expression engine says, alright, let me look ahead; I should find a match for S, E, A, S, H, O, R, E.
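The warning about the stray space can be checked with a short sketch. It assumes Python's re module in its default mode (a free-spacing mode such as re.VERBOSE would treat the space differently).

```python
import re

# Without a space: the assertion looks for "shore" right after "sea".
without_space = re.search(r"sea(?=shore)", "seashore")

# With a space: the assertion now demands a literal space before "shore".
with_space = re.search(r"sea(?= shore)", "seashore")
with_space_on_two_words = re.search(r"sea(?= shore)", "sea shore")

print(without_space is not None)            # the intended pattern matches
print(with_space is None)                   # the spaced pattern does not
print(with_space_on_two_words is not None)  # unless the text really has a space
```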

Great! It does have that match. It passes the assertion. So since the assertion passes, now it proceeds to the second part of the expression; the non-grouped part, which is just the characters S, E, A. It does match those, and that's the part that matches; not the whole word, not seashore, just S, E, A. If we tried the same thing with seaside, the assertion would start running, and it would say, I am looking for this pattern. It looks for seashore, it says nope, it fails, and it just stops right there, and says not a match, and never attempts the second part of the expression.
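The walkthrough above can be reproduced with a small sketch, again assuming Python's re module rather than the tool shown in the movie.

```python
import re

# Assertion first, then the literal characters s, e, a.
pattern = re.compile(r"(?=seashore)sea")

hit = pattern.search("seashore")   # assertion passes, then "sea" is matched
miss = pattern.search("seaside")   # assertion fails, so there is no match

print(hit.group(), hit.span())     # only "sea" is part of the match
print(miss)
```

Note that the match covers only the first three characters, even though the assertion inspected the whole word.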

Now, you may notice that I am repeating S, E, A there in both of those. You can actually write this same expression this way as well; with the S, E, A first, and then our assertion this time is not for seashore, but just for shore; that is, sea(?=shore). What we are essentially saying is, if you find a match for S, E, A, now look ahead at what comes next, and see if you have a match for shore. See if that's what follows. Both of these would return the exact same thing: a match on just the characters S, E, A, and only when it's inside seashore, not seaside.
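A quick way to convince yourself that the two spellings behave the same is to run both over the same text. This is a sketch in Python; the variable names are only illustrative.

```python
import re

leading = re.compile(r"(?=seashore)sea")   # assertion placed before "sea"
trailing = re.compile(r"sea(?=shore)")     # assertion placed after "sea"

text = "seashore seaside"

leading_spans = [m.span() for m in leading.finditer(text)]
trailing_spans = [m.span() for m in trailing.finditer(text)]

# Both find "sea" once, inside seashore only.
print(leading_spans, trailing_spans)
```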

We'll talk about why you might choose one example over the other in the next movie. But for now, let's try some examples, so you can get the hang of it. To start out with, let's put in our seashore and seaside here, and then for our regular expression, let's first put in our first one. If you just do S, E, A, we see what it matches there, and it matches it in both of them. What we want to do now is put in that lookahead assertion, and say, only match it if you can first match seashore. So you can see that it does that, and it only finds it in the first word, not in the second word.

And as I said, that's the same thing; I'll just cut the assertion, and paste it at the end. Take out the sea. It's the exact same thing as if we do it that way. Now contrast that with using a non-capturing group, because I want you to see the difference there; it's not the same thing. We are talking about what gets matched, not what gets captured. I want to make sure you understand that difference. Before, we were talking about capturing, so don't mistake the two. See, here the match is actually for seashore, and we've told it that shore is a non-capturing group.

Here we've said, don't match shore at all; just look ahead, and see if it's there. It's an important difference. The same goes for a capturing group; don't mistakenly think that this is somehow equivalent. Here we're capturing sea. We want to capture that for use later in a Find and Replace, or something like that, but we are still matching the entire word: seashore. Let's try a more complex example. I am going to open up this text here. This is Ralph Waldo Emerson's Self-Reliance; just a text I can copy. This is in the exercise files, and then let's just paste it in here. And what I want to do is find all words that are followed by a comma.
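The difference between what gets matched and what gets captured is easier to see side by side. The sketch below assumes Python's re module and the patterns sea(?=shore), sea(?:shore), and (sea)shore as the three variants being compared.

```python
import re

text = "seashore"

lookahead = re.search(r"sea(?=shore)", text)      # shore is only checked
non_capturing = re.search(r"sea(?:shore)", text)  # shore is matched, not captured
capturing = re.search(r"(sea)shore", text)        # sea is captured, shore is matched

print(lookahead.group())                        # the match stops after sea
print(non_capturing.group())                    # the match is the whole word
print(capturing.group(), capturing.group(1))    # whole word matched, sea captured
```

Only the lookahead keeps shore out of the match; both group forms consume it.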

So we know how to do the basics of that. Let's say we are going to start with backslash b, for our word boundary. We are going to need a word boundary at the end too, and in between those, we are going to have a character set with A to Z, a to z, and let's put in apostrophe as well, so you get words that have apostrophes, and that set is repeated, so there we are. Now we've got all the words. What we want is the words that end in a comma. Well, you could put a comma at the end of this, and find those words, but then we've matched both the word, and the comma. What we want instead is to use a lookahead assertion to say, look ahead for that comma, but don't actually match it.
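Here is a rough sketch of the starting point described above. It assumes Python's re module, a plus sign for the repetition, and a made-up stand-in sentence in place of the Emerson text from the exercise files.

```python
import re

# Stand-in sentence (hypothetical; not the text from the exercise files).
text = "Trust yourself, friend, and don't worry, it's fine."

# Assumed pattern: letters or apostrophes, one or more, between word boundaries.
words = re.findall(r"\b[A-Za-z']+\b", text)

# Adding a literal comma finds the right words, but drags the comma into each match.
words_with_comma = re.findall(r"\b[A-Za-z']+\b,", text)

print(words)
print(words_with_comma)
```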

Do you see the difference there? Try it out a few times back and forth if you need to, 'til you get the hang of it. The difference is that we are asserting that the comma ought to be present, but we are not making it part of our match. And again, you can compare that against the non-capturing parentheses, and see that that does include it in the match. What we are interested in is using the assertion to not match it. So, hopefully you're starting to see what a powerful tool lookaround assertions can be. They allow you to look around the area that you're matching to satisfy certain conditions.
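The back-and-forth comparison can be sketched as follows, assuming Python's re module, a plus sign for the repetition, and a made-up sample sentence rather than the Emerson text.

```python
import re

# Stand-in sentence (hypothetical; not the text from the exercise files).
text = "Trust yourself, friend, and don't worry, it's fine."

# Lookahead: the comma must be there, but it stays out of the match.
lookahead = re.findall(r"\b[A-Za-z']+\b(?=,)", text)

# Non-capturing parentheses: the comma is still part of the match.
non_capturing = re.findall(r"\b[A-Za-z']+\b(?:,)", text)

print(lookahead)
print(non_capturing)
```

Both patterns pick out the same three words; only the lookahead version returns them without the trailing comma.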

We'll learn about another powerful way that we can make use of them in the next movie.

Kevin Skoglund
Author

 
  1. 2m 18s
    1. Welcome
      56s
    2. Using the exercise files
      1m 22s
  2. 19m 55s
    1. What are regular expressions?
      3m 20s
    2. The history of regular expressions
      6m 40s
    3. Regular expression engines
      2m 44s
    4. Installing an engine
      4m 5s
    5. Notation conventions and modes
      3m 6s
  3. 21m 23s
    1. Literal characters
      6m 39s
    2. Metacharacters
      2m 1s
    3. The wildcard metacharacter
      4m 31s
    4. Escaping metacharacters
      4m 53s
    5. Other special characters
      3m 19s
  4. 31m 26s
    1. Defining a character set
      5m 49s
    2. Character ranges
      4m 49s
    3. Negative character sets
      4m 53s
    4. Metacharacters inside character sets
      5m 12s
    5. Shorthand character sets
      6m 30s
    6. POSIX bracket expressions
      4m 13s
  5. 36m 38s
    1. Repetition metacharacters
      7m 17s
    2. Quantified repetition
      6m 59s
    3. Greedy expressions
      6m 27s
    4. Lazy expressions
      6m 46s
    5. Using repetition efficiently
      9m 9s
  6. 20m 24s
    1. Grouping metacharacters
      4m 14s
    2. Alternation metacharacter
      4m 54s
    3. Writing logical and efficient alternations
      7m 33s
    4. Repeating and nesting alternations
      3m 43s
  7. 19m 19s
    1. Start and end anchors
      7m 21s
    2. Line breaks and Multiline mode
      4m 41s
    3. Word boundaries
      7m 17s
  8. 23m 33s
    1. Backreferences
      8m 57s
    2. Backreferences to optional expressions
      3m 51s
    3. Finding and replacing using backreferences
      7m 16s
    4. Non-capturing group expressions
      3m 29s
  9. 32m 31s
    1. Positive lookahead assertions
      6m 39s
    2. Double-testing with lookahead assertions
      7m 16s
    3. Negative lookahead assertions
      6m 10s
    4. Lookbehind assertions
      6m 26s
    5. The power of positions
      6m 0s
  10. 13m 13s
    1. About Unicode
      4m 19s
    2. Unicode in regular expressions
      4m 41s
    3. Unicode wildcards and properties
      4m 13s
  11. 1h 55m
    1. How to use this chapter
      5m 38s
    2. Matching names
      6m 33s
    3. Matching postal codes
      8m 54s
    4. Matching email addresses
      5m 0s
    5. Matching URLs
      8m 1s
    6. Matching decimal numbers and currency
      6m 45s
    7. Matching IP addresses
      7m 10s
    8. Matching dates
      7m 49s
    9. Matching times
      8m 59s
    10. Matching HTML tags
      8m 34s
    11. Matching passwords
      6m 49s
    12. Matching credit card numbers
      9m 36s
    13. Finding words near other words
      6m 38s
    14. Formatting with Search and Replace, pt. 1
      7m 22s
    15. Formatting with Search and Replace, pt. 2
      4m 15s
    16. Formatting with Search and Replace, pt. 3
      7m 10s
  12. 47s
    1. Goodbye
      47s
