
Defining a character set

From: Using Regular Expressions


In the last chapter we learned about matching single characters, and we also saw our first metacharacter, the wildcard. In this chapter, we'll talk about character sets. In a way, the wildcard is a character set too; it's just a character set of all characters. As we saw, that results in really broad matches. What we want to do instead is narrow our expression so that it matches less. Remember, the two tricks to regular expressions are matching what you want and matching only what you want. So we need the ability to be more specific about what should match, so that it doesn't just match everything indiscriminately. The way we're going to do that is by learning two more metacharacters that help us define a character set: the opening and closing square brackets.

These square brackets indicate a character set, which will match any one of several characters: the characters listed inside the set. But this is very important: it will match only one character. The order of the characters inside the set does not matter; the set is just the list of items that can match. So for example, if we have [aeiou], that will match any one vowel. That's it. Let's say we have it inside a word, like gr[ea]y. That will match a literal g and r and a literal y, with exactly one letter in between. And what can that letter be? Well, our character set tells us it's either an e or an a, so that will match grey with an e or gray with an a.
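The two examples above can be tried out with a minimal sketch. It assumes Python's built-in re module as the engine (the course may well use a different one), and the test words are made up for illustration.

```python
import re

# [aeiou] matches exactly one character, and only if it is one of these vowels.
vowel = re.compile(r"[aeiou]")

# gr[ea]y: literal g and r, then one character that is e or a, then literal y.
grey_or_gray = re.compile(r"gr[ea]y")

# Hypothetical test words, not taken from the course.
words = ["grey", "gray", "groy", "greay"]
set_results = {word: bool(grey_or_gray.fullmatch(word)) for word in words}

# The order inside the brackets does not matter: [ae] behaves like [ea].
reordered_results = {word: bool(re.fullmatch(r"gr[ae]y", word)) for word in words}

print(vowel.fullmatch("e") is not None)   # True: one vowel
print(vowel.fullmatch("ea") is not None)  # False: a set stands for one character only
print(set_results)  # {'grey': True, 'gray': True, 'groy': False, 'greay': False}
print(set_results == reordered_results)   # True
```

Here fullmatch is used so that each word has to fit the pattern from start to end, which makes the "only one character" rule easy to see.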

Now notice that gr[ea]t does not match the word great. Don't be thrown off by that; the set stands for a single character. This will still match a four-letter word: gr, then something, followed by a t, and that something has to be either an e or an a. These brackets are going to be a really big source of power for regular expressions, because we can be very specific about what should be allowed in that spot instead of just having that big open wildcard. Let's take a quick look at how the regular expression engine parses this kind of regular expression.
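A short check of that point, again as a sketch that assumes Python's re module and a few hypothetical candidate words:

```python
import re

pattern = re.compile(r"gr[ea]t")

# Hypothetical candidates. The set fills exactly one position between "gr" and "t".
candidates = ["great", "gret", "grat", "grit"]
matches = {word: bool(pattern.search(word)) for word in candidates}

print(matches)  # {'great': False, 'gret': True, 'grat': True, 'grit': False}
```

The word great fails because after g, r, and e the pattern expects a t but finds an a.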

So once again, we have our sentence "The cow, camel and cat communicated," but this time our regular expression is not cat; it's c followed by a character set that can be a, e, i, o, or u, and then a literal t after it: c[aeiou]t. Of course it starts at the beginning, and you can assume it moves along character by character (we've already talked about how that works) until it gets to a c. (The c in cow fails the same way as the one we're about to look at: o is in the set, but w is not a t.) When it gets to the c in camel, it matches, and says okay, I've got a literal c there. Let's move to the next character. Is this character in that character set? Is it one of the characters that's been defined? Yes, it is, so now it moves forward to the next character and says, is that the literal t that comes after the character set? It's not, so it backtracks to the a and says, all right, is that the c that I'm looking for? It's not, so it keeps moving along, and it works its way down until it gets to the word cat, and then it finally makes the match. It says, ah! Here is a c, here is an a that's inside the set, and here's the literal t. Now I have a match.
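The walk-through above can be imitated with a small hand-written loop. This is a simplified model for illustration, not how a real engine is implemented; the function name trace_c_vowel_t and the outcome labels are hypothetical, and Python's re module is used only to cross-check the result.

```python
import re

SENTENCE = "The cow, camel and cat communicated"
VOWELS = set("aeiou")


def trace_c_vowel_t(text):
    """For every literal c, record how far an attempt at c[aeiou]t gets."""
    attempts = []
    for i, ch in enumerate(text):
        if ch != "c":
            continue  # not the literal c, so move along one character
        following = text[i + 1:i + 2]  # empty string if the text ends here
        after_set = text[i + 2:i + 3]
        if following == "" or following not in VOWELS:
            attempts.append((i, "set failed"))
        elif after_set != "t":
            attempts.append((i, "t failed"))  # give up and resume after this c
        else:
            attempts.append((i, "match"))
    return attempts


attempts = trace_c_vowel_t(SENTENCE)
manual_starts = [i for i, outcome in attempts if outcome == "match"]
engine_starts = [m.start() for m in re.finditer(r"c[aeiou]t", SENTENCE)]

print(attempts)
# [(4, 't failed'), (9, 't failed'), (19, 'match'), (23, 't failed'), (30, 'match')]
print(manual_starts == engine_starts)  # True
```

The attempts at positions 4, 9, and 23 (cow, camel, and the start of communicated) get past the set but fail on the literal t, which is the situation described for camel.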

Then of course it does the same thing as it moves through communicated, where it finds a second match. So the process of the match is still the same; it's just that now that we have this character set, the engine uses the set to decide whether a character matches instead of comparing against a literal character. Let's try a few out. Let's start with our examples. Say we have [aeiou], so we're going to match any one character that is a vowel. And I'm going to try bananas and peaches and Apples. Now notice here that in bananas it matched the a, the a, and the a. That's because I have Global turned on, so it matches all of them. And then in peaches, notice that in the ea, it matched two times.
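A rough equivalent of this experiment, assuming Python's re module, where finditer plays the role of the Global option; the sample text is a stand-in for what is typed in the video:

```python
import re

text = "bananas peaches"  # hypothetical sample text

# finditer reports every match in the text, not just the first one.
found = [(m.group(), m.span()) for m in re.finditer(r"[aeiou]", text)]

for letter, span in found:
    print(letter, span)
# a (1, 2)
# a (3, 4)
# a (5, 6)
# e (9, 10)
# a (10, 11)
# e (13, 14)
```

The spans (9, 10) and (10, 11) show that the e and the a in peaches are two separate one-character matches sitting next to each other.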

The colors let you know that these are actually two different matches. The e matched and the a matched. It's not matching e and a together; each match is only one character. Notice also that this A here is not matched: the capital A in Apples. Matching is case sensitive unless we check the case-insensitive option, and the same is true inside character sets. With that option checked, it suddenly does match the capital A; [aeiou] will match a vowel regardless of whether it's uppercase or lowercase. All right, let's try with gray: g-r-e-y and g-r-a-y. Let's change our expression so that we match gr and y on the outside, and in between we look for an e or an a.
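The case-sensitivity behavior described above can be sketched as follows, assuming Python's re module, where the re.IGNORECASE flag stands in for the case-insensitive checkbox in the video's tool:

```python
import re

text = "Apples"

# By default the set is case sensitive, so the capital A is skipped.
case_sensitive = re.findall(r"[aeiou]", text)

# With the ignore-case flag, the same set also accepts uppercase vowels.
case_insensitive = re.findall(r"[aeiou]", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['e']
print(case_insensitive)  # ['A', 'e']
```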

So now I can match anything that has an e or an a there. It doesn't matter if we have more things in the set: b, c, d, it doesn't make any difference. Notice it also doesn't make a difference what order they are in. If I write [ae] instead of [ea], that doesn't make any difference either. Another one: let's try with great. That was the other example we had. If we put great in our text and use gr[ea]t, notice that it does not match. If we want it to match, we have to add another character set here: gr[ea][ea]t. We could do it that way. Now it does match: gr followed by one character that is an e or an a, followed by another character that is an e or an a, followed by a t. See how that works? Be careful though. What this also matches, of course, is graet, greet, and graat. See why that's true? Those are the different combinations that we can come up with by doing that.
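One way to see every combination that two sets in a row allow is to enumerate them. This is a sketch assuming Python's re module; itertools.product is used only to generate the candidate words.

```python
import re
from itertools import product

pattern = re.compile(r"gr[ea][ea]t")

# Every word the pattern can match in full: one choice from "ea" for each set.
combinations = ["gr" + first + second + "t" for first, second in product("ea", repeat=2)]
all_match = all(pattern.fullmatch(word) for word in combinations)

print(combinations)  # ['greet', 'great', 'graet', 'graat']
print(all_match)     # True
print(pattern.fullmatch("gret") is None)  # True: two characters are now required
```

Two sets with two options each give four possible words, only one of which was the intended target.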

So just be careful when you build a set like this to think about the other things it might also match, all the different combinations it allows. Let's try one last one. Let's just use a string. We'll just say Hello, and then up here let's type in an expression that will match it. To match the capital letter that might be at the beginning of the word, we're going to use [ABCDEFGHIJKLMNOPQRSTUVWXYZ] (wow, there we go) followed by the rest of the word, so that it matches Hello. That was a lot of typing to get all of those uppercase letters, right? If I wanted to do the same thing for the next letter, I'd have to do it all over again with lowercase letters.
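A sketch of that last experiment, assuming Python's re module and assuming that the part typed after the set is the literal text ello (the transcript does not spell it out):

```python
import re

# One set listing every uppercase letter, followed by the literal text "ello".
# The "ello" suffix is an assumption made for this illustration.
capitalized = re.compile(r"[ABCDEFGHIJKLMNOPQRSTUVWXYZ]ello")

print(bool(capitalized.fullmatch("Hello")))  # True
print(bool(capitalized.fullmatch("Jello")))  # True
print(bool(capitalized.fullmatch("hello")))  # False: lowercase h is not in the set
```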

Fortunately there's a much simpler way to do that: we can use character ranges inside our character sets. We'll take a look at how to do that in the next movie.


This video is part of

Using Regular Expressions

59 video lessons · 11675 viewers

Kevin Skoglund
Author

 
