Using Regular Expressions
Illustration by Mark Todd

The power of positions


From: Using Regular Expressions with Kevin Skoglund

Video: The power of positions


5h 36m Intermediate Nov 21, 2011


Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.

Topics include:
  • Creating flexible patterns using character sets
  • Achieving efficiency when using repetition
  • Understanding different types of search strategies
  • Writing logical and efficient alternations
  • Capturing groups and reusing them with backreferences
  • Developing complex patterns with lookaround assertions
  • Working with Unicode and multibyte characters
  • Matching email addresses, URLs, dates, HTML tags, and credit card numbers
  • Using search and replace to format a document
Subject:
Developer
Software:
Regular Expressions
Author:
Kevin Skoglund

The power of positions

We've covered the fundamentals of lookaround assertions, and we've seen that essentially they allow testing of a regular expression apart from matching. This simple fact gives us a lot of functionality; some very cool features that we can use. We've seen that we can peek forwards and backwards to see whether something is true. We can match a string by using multiple expressions. We saw how we can define rejection expressions; things that should be excluded from our match, and we even saw that we could find the last occurrence of something. Now I want us to talk about a subtle but powerful aspect that we haven't really covered in depth yet, and that is the power of positions.

If you'll remember, all of these assertions are zero-width. Hopefully I've drilled that fact into your head by now, and zero-width means zero position movement. Here is what I mean by that. Let's say we have our original example, where we have a positive lookahead assertion for seashore, followed by sea. That would match the sea in seashore. Now remember, what happens is the regular expression engine first tests our assertion, and when it's done, it rewinds back to where the assertion began. So at that point, there has been no position movement; the position has not changed at the moment when the assertion ends.
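To make the rewinding a little more concrete, here is a minimal sketch in Python. The choice of Python and its re module is an assumption for illustration, not the engine used in the video, and the sample text is made up. It shows that the lookahead consumes nothing, so the match is only the three characters of sea.

```python
import re

# The lookahead checks that "seashore" starts at this position, then the
# engine rewinds and matches the literal "sea" from the same position.
pattern = re.compile(r"(?=seashore)sea")

# Hypothetical sample text: a standalone "sea" followed by "seashore".
match = pattern.search("sea and seashore")

matched_text = match.group()  # the characters actually consumed
matched_span = match.span()   # start and end offsets of the match

print(matched_text, matched_span)  # sea (8, 11)
```

The standalone sea at the start is skipped because the assertion fails there, and the match covers only offsets 8 to 11, not the whole word seashore.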

Let me give you another example. Here, we are going to use a lookbehind assertion. What I am going to do is I am going to have 54.00, and dollar sign 54.00. Those are my two bits of text. What I have is a negative lookbehind assertion that looks behind, and makes sure that I don't have a dollar sign or a digit, and then I do have some digits, followed by a decimal point, and two more digits. So that will match 54.00, but not when it has that dollar sign in front of it. That negative lookbehind makes sure that we don't match the one that has the dollar sign there.
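As a rough sketch of that expression, again assuming Python's re module rather than the tool used in the video, the pattern below puts the dollar sign and the digit shorthand in one character set inside a negative lookbehind. For comparison, it also tries a hypothetical weaker version whose lookbehind only rejects the dollar sign.

```python
import re

text = "54.00 and $54.00"

# Negative lookbehind: no dollar sign and no digit immediately before.
amount = re.compile(r"(?<![$\d])\d+\.\d\d")
found = amount.findall(text)

# Hypothetical weaker variant that only rejects a preceding dollar sign.
weaker = re.compile(r"(?<!\$)\d+\.\d\d")
found_weaker = weaker.findall(text)

print(found)         # ['54.00']
print(found_weaker)  # ['54.00', '4.00']
```

The weaker variant picks up 4.00 out of the second amount, because the 4 is preceded by a 5 and not by a dollar sign.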

We included the digit along with that dollar sign to make sure that it also doesn't match the 4.00 in dollar sign 54.00. Otherwise, it goes ahead and takes any digits that it can. All right! Now imagine for a moment, what if we took that second expression -- the backslash, D, plus, backslash, period, backslash, D, backslash, D. What if we took that part, and put that inside a positive lookahead assertion, like this? At the end of that, what gets matched? We have two assertions; a lookbehind assertion, and a lookahead assertion. Nothing gets matched, because neither one of those assertions includes any text in the match.

But the regex engine does find a match; it does succeed and say, ah! Both of these assertions are true, but I have a zero-width match. So it matches, but the final match is zero-width. More importantly, though, where is the regular expression engine pointer at the end of this, after it makes the match? It rewinds once it's done its backwards looking; it rewinds once it's done its forwards looking. So at that point, the regex engine's pointer is sitting right in front of the 5 for 54.00. Do you see that? Now, why does it matter that we've matched zero-width, and the pointer is now sitting at that place, and the regex engine thinks that it's accomplished its job? Well, as I said, it's a subtle point, but it's very powerful, because this is very useful for inserting text by using Find and Replace.
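Here is a small sketch of that zero-width match, assuming Python's re module; the video uses a different tool. Both halves of the expression are assertions, so the engine reports a successful match that is empty, and the span tells us where the pointer is sitting.

```python
import re

text = "54.00 and $54.00"

# Lookbehind and lookahead only: nothing is consumed by either one.
position = re.compile(r"(?<![$\d])(?=\d+\.\d\d)")

matches = list(position.finditer(text))
spans = [m.span() for m in matches]   # where each match starts and ends
texts = [m.group() for m in matches]  # what each match contains

print(spans)  # [(0, 0)]
print(texts)  # ['']
```

There is exactly one match, it is the empty string, and it sits at offset 0, right in front of the 5 of the first 54.00.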

We've located a position, the match that we are going to replace is zero-width, but we are going to replace it with something that does have some width. So essentially, that's the same thing as inserting. Going to a position, not selecting any width, but putting something in that place. Pretty cool, huh? Let's try it out. And because we are going to be using Find and Replace, I am going to be showing this to you in TextMate. Let's say that we've got a simple sentence; This costs 53.00, or 54.00 with a dollar sign in front of it. I am going to open up my Find.
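The same idea can be sketched outside of TextMate. This is only an illustration, assuming Python's re.sub as the Find and Replace, and assuming the sentence is punctuated the way it is written here. Replacing a zero-width match with a dollar sign behaves like an insertion at that position.

```python
import re

# Assumed wording of the sample sentence.
sentence = "This costs 53.00 or $54.00."

# Zero-width position: not preceded by "$" or a digit, followed by an amount.
position = re.compile(r"(?<![$\d])(?=\d+\.\d\d)")

# In a Python replacement string, "$" is a literal dollar sign.
result = position.sub("$", sentence)

print(result)  # This costs $53.00 or $54.00.
```

Nothing in the original sentence is removed; the only change is the dollar sign dropped in front of 53.00.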

I'll do that with Command+F, and for my regular expression here, let's put in a negative lookbehind. So not anything in the character set, dollar sign, or digit. And then once that lookbehind is done, let's just look for some digits, backslash, decimal, backslash, D, backslash, D. So it matches 53.00, but it does not match 54.00. So that one is just a full regular expression. We've seen that before. Now what we want to do is turn that into a positive lookahead assertion.

So now, we're just asserting that this ought to be true. Notice where my cursor went? Let me move it, just so you see it again. It jumps right in front of the 53. So now, if we say Replace All, boom! Look at that. It just dropped in our dollar sign right in front of the five. Let's try another example. I am going to put in a sentence here. An astronomical unit is, this very long number of kilometers; approximately the average distance between the Sun and Earth. What we want to do is add commas to delimit this number.

So after every three digits, there should be a comma. So how can we accomplish that? Let's go here to our regular expression. We know we are going to want to replace it with a comma. Our Find, though; we are going to want to find three digits in a row. We are going to group those together, and repeat those, and that should then find sets of three digits. Ah! But, look at that; it found the whole entire thing. What we want to do is tell it to look behind, and make sure that there is a digit in front of it. No reason to delimit the first three. We only want to delimit it if there's a digit in front of it. At the end, we're going to have a negative lookahead assertion for something being a digit at the end.

So now we are saying, all right; find the sets of digits that have a number in front of them, and don't have a number after them, and in sets of three. Let's take that whole thing, and now let's turn this expression right here -- this middle part -- into an assertion as well. So now it will have zero-width, and actually we want to include this in the assertion as well; that it has no digit after it, and now let's do a Find. There it is! So it found the spot between the first three digits, and the spot between the second three. If I just keep going back and forth, you see that's the two spots that it found, and so now if I do my insertion in each of those spots -- let's do Find, and let's do Replace All -- now we get our comma delimited number.
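To recap the comma example in code, here is a minimal sketch assuming Python's re module. The transcript does not give the number itself, so the nine-digit value below is an assumption (the astronomical unit rounded to whole kilometers), and the sentence is shortened. The whole expression is a lookbehind for one digit plus a lookahead for sets of three digits with no digit after them.

```python
import re

# Assumed sample number: the astronomical unit rounded to whole kilometers.
sentence = "An astronomical unit is 149597871 kilometers."

# A digit behind us, and ahead of us one or more sets of three digits
# that are not followed by another digit. Both parts are zero-width.
comma_spots = re.compile(r"(?<=\d)(?=(\d\d\d)+(?!\d))")

# subn returns the new string and the number of replacements made.
result, count = comma_spots.subn(",", sentence)

print(result)  # An astronomical unit is 149,597,871 kilometers.
print(count)   # 2
```

The two replacements are the two spots found in the walkthrough, and because each match is zero-width, no digits are lost when the commas go in.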

So as I said, this is a very powerful behavior that arises from a very subtle aspect of the way that the zero-width assertions work.
