Using Regular Expressions
Illustration by Mark Todd

with Kevin Skoglund

Video: Metacharacters inside character sets
  1. 2m 18s
    1. Welcome
      56s
    2. Using the exercise files
      1m 22s
  2. 19m 55s
    1. What are regular expressions?
      3m 20s
    2. The history of regular expressions
      6m 40s
    3. Regular expression engines
      2m 44s
    4. Installing an engine
      4m 5s
    5. Notation conventions and modes
      3m 6s
  3. 21m 23s
    1. Literal characters
      6m 39s
    2. Metacharacters
      2m 1s
    3. The wildcard metacharacter
      4m 31s
    4. Escaping metacharacters
      4m 53s
    5. Other special characters
      3m 19s
  4. 31m 27s
    1. Defining a character set
      5m 49s
    2. Character ranges
      4m 49s
    3. Negative character sets
      4m 53s
    4. Metacharacters inside character sets
      5m 12s
    5. Shorthand character sets
      6m 31s
    6. POSIX bracket expressions
      4m 13s
  5. 36m 39s
    1. Repetition metacharacters
      7m 17s
    2. Quantified repetition
      6m 59s
    3. Greedy expressions
      6m 27s
    4. Lazy expressions
      6m 47s
    5. Using repetition efficiently
      9m 9s
  6. 20m 24s
    1. Grouping metacharacters
      4m 14s
    2. Alternation metacharacter
      4m 54s
    3. Writing logical and efficient alternations
      7m 33s
    4. Repeating and nesting alternations
      3m 43s
  7. 19m 19s
    1. Start and end anchors
      7m 21s
    2. Line breaks and Multiline mode
      4m 41s
    3. Word boundaries
      7m 17s
  8. 23m 33s
    1. Backreferences
      8m 57s
    2. Backreferences to optional expressions
      3m 51s
    3. Finding and replacing using backreferences
      7m 16s
    4. Non-capturing group expressions
      3m 29s
  9. 32m 32s
    1. Positive lookahead assertions
      6m 39s
    2. Double-testing with lookahead assertions
      7m 16s
    3. Negative lookahead assertions
      6m 11s
    4. Lookbehind assertions
      6m 26s
    5. The power of positions
      6m 0s
  10. 13m 13s
    1. About Unicode
      4m 19s
    2. Unicode in regular expressions
      4m 41s
    3. Unicode wildcards and properties
      4m 13s
  11. 1h 55m
    1. How to use this chapter
      5m 38s
    2. Matching names
      6m 33s
    3. Matching postal codes
      8m 54s
    4. Matching email addresses
      5m 0s
    5. Matching URLs
      8m 1s
    6. Matching decimal numbers and currency
      6m 45s
    7. Matching IP addresses
      7m 10s
    8. Matching dates
      7m 49s
    9. Matching times
      8m 59s
    10. Matching HTML tags
      8m 34s
    11. Matching passwords
      6m 49s
    12. Matching credit card numbers
      9m 36s
    13. Finding words near other words
      6m 38s
    14. Formatting with Search and Replace, pt. 1
      7m 22s
    15. Formatting with Search and Replace, pt. 2
      4m 15s
    16. Formatting with Search and Replace, pt. 3
      7m 10s
  12. 47s
    1. Goodbye
      47s

Watch the Online Video Course Using Regular Expressions
5h 36m Intermediate Nov 21, 2011

Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.

Topics include:
  • Creating flexible patterns using character sets
  • Achieving efficiency when using repetition
  • Understanding different types of search strategies
  • Writing logical and efficient alternations
  • Capturing groups and reusing them with backreferences
  • Developing complex patterns with lookaround assertions
  • Working with Unicode and multibyte characters
  • Matching email addresses, URLs, dates, HTML tags, and credit card numbers
  • Using search and replace to format a document
Subject:
Developer
Software:
Regular Expressions
Author:
Kevin Skoglund

Metacharacters inside character sets

In this movie we're going to talk about how you handle metacharacters inside character sets. As a general rule, you don't need to do anything: a metacharacter inside a character set is already escaped, so you do not need to escape it again. It no longer has its metacharacter meaning. So for example, if we were looking for h, then a character set containing abc.xyz, followed by a literal t (that is, h[abc.xyz]t), that would match hat and h.t, but not hot, because the dot that's inside the character set is not a wildcard. It does not have its metacharacter meaning; it is a literal dot. Now there are some exceptions to this, and they make sense: they are the metacharacters that have to do with character sets themselves. Those are the closing square bracket, the dash or range character, the caret or negation character, and then the backslash, which of course we're using to do escaping.
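As a quick check of the rule above, here is a minimal sketch using Python's re module. The choice of Python and the variable names are assumptions for illustration; the course itself is not tied to one engine.

```python
import re

# The dot sits inside the character set, so it is a literal dot,
# not the wildcard metacharacter.
pattern = re.compile(r"h[abc.xyz]t")

words = ["hat", "h.t", "hot"]

# fullmatch requires the whole word to match the pattern.
matches = [w for w in words if pattern.fullmatch(w)]
print(matches)  # ['hat', 'h.t']
```

If the dot were still a wildcard here, hot would match as well; it does not, because o is not one of the listed characters.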

These characters typically do need to be escaped. The opening square bracket usually doesn't need to be. The reason you need to escape the closing one is this: imagine that in the example up there, instead of the dot, I had a closing square bracket. Well, then the engine would say, oh, the character set is defined by abc, because I've hit another square bracket. If we escape it, then it says, ah, this is a literal square bracket, not the end of my character set, and it keeps on going until it gets to the actual square bracket that closes out the set. Let's take a look at some examples.

So for example, let's say we were looking for var and then a number in either parentheses or square brackets. So we've got three character sets that we're looking for. Let me highlight them so it's a little clearer. The first character set there is in orange, the next one is in black, and the next one is in orange. Notice in the second character set I escaped the square bracket. If you stop and look at it, you'll see that without the escape that character set would have said, oh, open square bracket followed immediately by close square bracket, I guess I'm done. So we have to say no, no, no, we want a literal square bracket, and then keep going from there. I didn't need to do that in the first one.
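The slide itself is not reproduced in this transcript, so the pattern below is a reconstruction under assumptions: three character sets, with both square brackets escaped for clarity. It is a sketch in Python's re module, and the sample strings are made up for illustration.

```python
import re

# Three character sets: an opening ( or [, a digit, a closing ) or ].
# Both square brackets are escaped so that no engine can mistake them
# for the start or end of a set.
pattern = re.compile(r"var[(\[][0-9][)\]]")

samples = ["var(3)", "var[4]", "var{5}"]
found = [s for s in samples if pattern.fullmatch(s)]
print(found)  # ['var(3)', 'var[4]']
```

Escaping the opening bracket is usually optional, as the transcript says; it is escaped here only to keep the pattern easy to read.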

Now you may not always have to escape these; each regular expression engine handles it just a little bit differently. So for example, if we were looking for a date where the separator between the year, the month, and the day was either a hyphen or a forward slash, a lot of regular expression engines would say, oh, that's not the range character. I know that's not the range character, because there's nothing in front of it, so it must be a literal hyphen. So you may not need to escape it. However, if we had a character in front of it, let's say we were looking for file and then we had 0, hyphen, backslash, or underscore as the character set, well, then we most likely would need to, because the engine would read the hyphen as a range that starts at 0 and try to give you all of the characters in that range.
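To illustrate the date case, here is a small sketch in Python's re module, which is one of the engines that treats a leading hyphen in a set as a literal. The year-month-day layout, the digit counts, and the sample strings are assumptions, not taken from the course.

```python
import re

# The hyphen comes first in the set, so there is nothing in front of it
# to start a range; Python's re treats it as a literal hyphen.
date_pattern = re.compile(r"[0-9]{4}[-/][0-9]{2}[-/][0-9]{2}")

samples = ["2011-11-21", "2011/11/21", "2011.11.21"]
matched = [s for s in samples if date_pattern.fullmatch(s)]
print(matched)  # ['2011-11-21', '2011/11/21']
```

Writing the set as [\-/] behaves the same way in Python and removes any doubt about how another engine would read it.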

Okay, so typically in that case you would need to escape it. Let's try these out. To start with, let's just enter our regular expression. It's kind of a silly one, but it makes the point: h, then the character set abc.xyz, and then t. Let's see. It matches hat, it does not match hot, but it does match h.t. See, the dot is automatically escaped. There is no need for us to put a backslash in there to escape it again. It's not the end of the world if you do; it doesn't ruin anything, because it just escapes something that's already escaped, but there's no need to do it. Let's try another one. Let's say we're looking for var(3) or var[4]; it's going to be one of those two formats.

So we're looking for var, and then we know we'll need a character set, then we know we're going to have 0-9, and then we know we're going to need another character set. That gives us a place to sort of fill this in. First of all, let's put the round parentheses in there. Okay, so now we've matched the first one. Now let's also include the second one in the character set, and let's put this right here. In this case it did match it. If I put it here, notice it did not match it. So you can see how it handled it just a little bit differently. It said, oh, well, if it's the second character then it probably isn't closing up the character set, but if it's down here, well, then I don't know the difference and I can't tell.
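The screen is not visible in the transcript, so the two placements below are an assumption about what was typed: an unescaped closing bracket directly after the opening bracket of the last set, and then the same bracket after the parenthesis. The sketch shows how Python's re module treats each; other engines may differ.

```python
import re

# Unescaped ] right after the opening [ of the last set:
# Python's re reads it as a literal member of the set.
first = re.compile(r"var[(\[][0-9][])]")

# Unescaped ] after the ): it closes the set early, so the pattern
# becomes "one of [)]" followed by a required literal ].
later = re.compile(r"var[(\[][0-9][)]]")

samples = ["var(3)", "var[4]"]
first_hits = [s for s in samples if first.fullmatch(s)]
later_hits = [s for s in samples if later.fullmatch(s)]

print(first_hits)  # ['var(3)', 'var[4]']
print(later_hits)  # []
```

The second pattern is still valid; it just means something else, and would match a string such as var(3)] instead.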

So you can see the engines are all just ever so slightly different. Instead of relying on where the bracket sits, we need to escape it like that. That's the proper way to handle it so that it's really unambiguous, and all regex engines should handle it the same way in that case. Let's try our last example, where we had a file. Actually, let's go down here first and put in file01, file-1, file\1, and file_1. All right, so what we want to do is write a regular expression that will help us find these. So what could those possibilities be? Well, there is a 0, a dash, a backslash, and an underscore.

So notice it didn't find file-1, because it thinks the hyphen is defining a range. Everything else sort of worked out, but we need to escape that hyphen first. Now let's escape that. Notice another one dropped off, file\1. We've now correctly defined the hyphen, but now it thinks the backslash is an escape, that we are escaping our underscore. That's not what we mean, so we need to escape the backslash as well. And now we've escaped the two characters that matter, and it finds all four of them. So again, as a general rule, you don't need to escape any metacharacters except the ones that have to do with character sets, and those you typically should escape, just to make sure that it's unambiguous.
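The three steps of that demo can be replayed with a short sketch in Python's re module. The helper name hits and the trailing 1 in each pattern are assumptions chosen to fit the four sample strings; results in other engines may differ for the unescaped version.

```python
import re

names = ["file01", "file-1", r"file\1", "file_1"]

def hits(pattern):
    """Return the sample names that the pattern matches in full."""
    return [n for n in names if re.fullmatch(pattern, n)]

# Step 1: nothing escaped. The hyphen forms a range from 0 to an escaped
# underscore, which happens to cover the backslash but not the hyphen.
unescaped = hits(r"file[0-\_]1")

# Step 2: hyphen escaped. The backslash still escapes the underscore,
# so the set holds 0, hyphen and underscore, but no backslash.
hyphen_escaped = hits(r"file[0\-\_]1")

# Step 3: hyphen and backslash both escaped.
both_escaped = hits(r"file[0\-\\_]1")

print(unescaped)       # ['file01', 'file\\1', 'file_1']
print(hyphen_escaped)  # ['file01', 'file-1', 'file_1']
print(both_escaped)    # ['file01', 'file-1', 'file\\1', 'file_1']
```

The printed lists show a doubled backslash only because Python displays string literals that way; each name contains a single backslash.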
