Lookbehind assertions

From: Using Regular Expressions

In this movie, we're going to be learning about lookbehind assertions. The metacharacters that we'll use for those are question mark, less than sign, equals, written (?<=, for a positive lookbehind assertion, and question mark, less than, exclamation point, written (?<!, for a negative lookbehind assertion. A lookbehind assertion is an assertion of what ought to come immediately before the current position. It's very similar to what we had with lookahead assertions. If the lookbehind expression fails, then the match will fail. In principle, any valid regular expression can be used inside there, although most engines restrict what they accept, and just like the lookahead assertions, they are going to be zero-width; they don't include the group in the match.

It's just an assertion about what ought to be true. The syntax is also very similar. Inside a group, the very first characters will be question mark, less than, and the equal sign, followed by your regular expression: (?<=regex). Or, question mark, less than, exclamation point, followed by your regular expression: (?<!regex). Don't let the fact that we now have three symbols throw you. They are the same as the lookahead assertions, but with a less than sign tossed in the middle. You should think of it as an arrow that's pointing backwards. So the question mark indicates a change to the group's meaning, the arrow or less than sign tells us to look backwards, and then the equal sign says that it's a positive assertion, or the exclamation point says that it's a negative assertion.
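As a rough illustration of the two forms and of what zero-width means, here is a minimal Python sketch. The sample strings and the dollar-sign patterns are made up for this example and do not come from the course.

```python
import re

# Positive lookbehind: digits only when a dollar sign sits immediately before them.
priced = re.search(r"(?<=\$)\d+", "cost: $42")
print(priced.group())  # the "$" is asserted but is not part of the match

# Negative lookbehind: a whole number only when no dollar sign sits before it.
counts = re.findall(r"(?<!\$)\b\d+", "42 items at $7")
print(counts)
```

The first search reports only the digits, because the lookbehind group checks for the dollar sign without consuming it.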

The way it would actually look in practice would be something like this: (?<=base)ball. We're going to look behind us for the word base, and match the word ball when that's true. So it will match the word ball in baseball, but not in football. The way that the regex engine actually parses this is that as it's going through the word baseball, on every single letter, it stops, and looks backwards to see if it finds the word base. If it doesn't, then it keeps moving. And once that condition is satisfied, which is the point at which we get right between the e and the b in baseball, it looks backwards, and it says, okay, I see base now.

Now this is true, therefore, now I'll check the next part of the expression, and see if I have ball, and it does, and so we have a match on the word ball. Now, like the lookahead assertions, you can flip it around, and put the lookbehind assertion at the end. However, typically you don't do this. Most of the time you put it at the front, because it's inefficient to move backwards over text that you've already matched once. Typically, you would either match the text the first time through, or use a lookahead assertion to make sure that it matches the second pattern, instead of backtracking through it a second time with a lookbehind assertion.
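The position-by-position walk described above can be sketched in Python. The helper find_with_lookbehind is a hypothetical, hand-rolled scan written only to mirror the explanation; real engines are implemented differently. It is compared against Python's own re module.

```python
import re

def find_with_lookbehind(text, behind, target):
    """Hand-rolled scan that roughly mimics (?<=behind)target."""
    for pos in range(len(text) + 1):
        # Stop at each position and look backwards, consuming nothing.
        if not text.endswith(behind, 0, pos):
            continue
        # The assertion holds, so now try the rest of the pattern going forwards.
        if text.startswith(target, pos):
            return pos
    return None

pattern = re.compile(r"(?<=base)ball")
for word in ("baseball", "football"):
    m = pattern.search(word)
    print(word, m.span() if m else None, find_with_lookbehind(word, "base", "ball"))
```

In baseball, both approaches succeed at index 4, the spot between the e and the b. In football, neither finds a match.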

There are some very rare cases where this could be useful, though. And of course, we can have the negative version of that, a negative lookbehind for base, followed by ball: (?<!base)ball. That would match ball in football, but not in baseball. Now, there is one important difference between lookbehind assertions and lookahead assertions, and that is the support for them. There is support for simple expressions in .NET, Java, Perl, PHP, Python, and Ruby 1.9. But at the time of this recording, lookbehind assertions are not supported in JavaScript (support arrived later, with ECMAScript 2018), in Ruby version 1.8, or in any of those UNIX tools that didn't support lookahead assertions anyway.
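A minimal Python sketch of the negative version mentioned above. The last check, on the bare word ball, is an extra case added for illustration and is not from the course.

```python
import re

negative = re.compile(r"(?<!base)ball")

in_football = negative.search("football")
in_baseball = negative.search("baseball")
bare = negative.search("ball")

print(in_football.span() if in_football else None)  # ball after "foot"
print(in_baseball)                                  # ball after "base" is rejected
print(bare.span() if bare else None)                # nothing precedes it, so "base" cannot
```

With nothing at all before ball, the negative assertion still succeeds, because the text behind that position cannot be base.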

Now, when I say simple expressions, what I mean is fixed length. If you think about it, everything in regular expressions so far has been about moving forward through strings, or rewinding. Well now we are talking about doing something different. We're talking about putting it in reverse, and going the opposite way looking for a string. That adds a whole new layer of complexity. It may seem like, oh, it's just simply going the other way, but it's kind of like driving a car. Everything about the car is built to go forwards. Now, you can go backwards, but it's a little bit harder to do, because the car is mostly designed to go forwards.

So it requires more effort on the part of the driver, and you typically have to go slower if you are going to go in reverse. It's the same thing with the lookbehind assertions. We can use literal text, and we can use character classes, because they represent a fixed length; just a single character. But we typically can't use repetition, or optional expressions, because those are not fixed length. We saw, when we were first learning about repetition, that there's a lot of inefficiency, and a lot of moving and rewinding that goes on when we start having repetition. We can use alternation, but only with fixed length items, and that's for the exact same reason.

So, for example, we can look back, and find whether something is preceded by cat, dog, or rat, but you cannot check and see whether it is preceded by apple, banana, or plum, because each of those has a different length to it. Now, there are two notable exceptions here, which is that Java does allow you to use optional items and repetition with an upper bound, and .NET allows repetition, and optional expressions, as well as alternation with non-fixed length items. But as a general rule, your lookahead assertions can be very complex, but you should make your lookbehind assertions very simple.
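To see the fixed-length rule in one concrete engine, here is a small sketch using Python's re module, which rejects variable-width lookbehind when the pattern is compiled. The helper name compiles and the trailing s in each pattern are illustrative choices. Other engines draw the line in different places.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

candidates = [
    r"(?<=cat|dog|rat)s",        # alternation, every option three characters long
    r"(?<=[a-z])s",              # character class, always one character
    r"(?<=apple|banana|plum)s",  # alternation with differing lengths
    r"(?<=a+)s",                 # repetition
    r"(?<=colou?r)s",            # optional item
]

results = {p: compiles(p) for p in candidates}
for p, ok in results.items():
    print(p, "accepted" if ok else "rejected")
```

The first two patterns are accepted in Python, and the last three are rejected as not fixed-width.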

Let's try a few simple examples. Now, we won't be able to use our RegexPal tool, because that's JavaScript, and, as I said, lookbehind isn't supported there. But just so you can see how they work, I am going to show you some examples by using TextMate. So here I am inside a TextMate document, and I am just going to type I like baseball and football. Alright. Now, in order to test these, I am going to use Find, which is Command+F, or you can select it from the Edit menu. And for my regular expression here, I am just going to put in ball to start with, and let's just do a Find. Let me turn on Wrap around. There it is, now it'll wrap around, and now I can see each of them.

Baseball and football each match. Now let's put in our lookbehind, and let's say, only when it has base in front of it. So now it matches ball in baseball, and that's the only one that it matches. If I hit Command+G to Find again, it only finds one of them. Same thing, of course, if we use the negative version: now it finds ball in football, and only football; not baseball anymore. Let's try another example. Let's say I have some names here. Let's have Benny, Benjamin, Jenny, and Lenny, and let's do a Find, and let's first just find jamin or ny.
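The three searches on the sentence above can be reproduced outside TextMate. This is a sketch in Python rather than in the editor's Find dialog, and the helper starts is a hypothetical name used only here.

```python
import re

sentence = "I like baseball and football"

def starts(pattern):
    """Start index of every match of pattern in the sentence."""
    return [m.start() for m in re.finditer(pattern, sentence)]

plain = starts(r"ball")               # both words
after_base = starts(r"(?<=base)ball") # only the ball in baseball
not_after_base = starts(r"(?<!base)ball")  # only the ball in football

print(plain, after_base, not_after_base)
```

The plain search reports two positions, and each lookbehind version keeps exactly one of them.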

I am using an or operator there. So what I am essentially doing is trying to find the thing that comes after that first part of the name. So it finds all of those. Let me open up Find again, and this time, let's add our lookbehind, so that it matches only if Ben is in front of it. So now it finds the ny in Benny, and the jamin that comes after Ben, but it does not find Jenny or Lenny. Now, we can use alternation in the lookbehind too, again with fixed length items here, such as Ben or Jen.

Now it finds it for Benjamin, for Jenny, and for Benny, and of course we have the negative version of that, which would find it for Lenny, but not for the others. So again, the concept is very similar to what we had for lookahead assertions. The only big difference is that it's not as widely supported, and we need to keep our expressions simple.
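The names example can be sketched in Python as follows. The transcript does not show the exact patterns typed on screen, so the ones below are assumptions reconstructed from the description. The helper names_hit is hypothetical, and its three-character prefix slice assumes every name starts with a three-letter stem.

```python
import re

names = "Benny, Benjamin, Jenny, and Lenny"

def names_hit(pattern):
    """Return each match together with the three characters before it."""
    return [names[max(0, m.start() - 3):m.end()] for m in re.finditer(pattern, names)]

endings = names_hit(r"(?:jamin|ny)")                    # no lookbehind
after_ben = names_hit(r"(?<=Ben)(?:jamin|ny)")          # only after Ben
after_ben_or_jen = names_hit(r"(?<=Ben|Jen)(?:jamin|ny)")   # fixed-length alternation
after_neither = names_hit(r"(?<!Ben|Jen)(?:jamin|ny)")  # negative version

print(endings)
print(after_ben)
print(after_ben_or_jen)
print(after_neither)
```

The matches themselves are only the endings such as ny or jamin. The helper prepends the preceding three characters purely so the output shows which name each match came from.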

Kevin Skoglund
Author

 