Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.
In this movie, we'll learn to write regular expressions to match passwords. Now, I'm not talking about matching passwords to make sure a user has given us the right password to gain access. That's the kind of task that can be left up to a Web server, or a Web application, or to a database. What I'm talking about is matching a password against a regular expression to make sure that it meets a set of password requirements, to ensure that the password is a secure password, and that we want to allow a user to use it. So let's look at what some of those requirements might be. For example, let's require that our user enter a password that contains any character, except a space.
Let's ensure that it's at least 8 characters long, but not more than 15 characters long, and that it includes at least one uppercase letter, one lowercase letter, one numeric digit, and one symbol. These steps are going to help to make sure that our user gives us a secure password that's harder for a hacker to crack. So let's try writing a regular expression that will make sure our password meets all of these requirements. Let's start with an example password. We're just going to have swordfish. That's a simple password, all lowercase, and it's a dictionary word.
That makes it very easy for a hacker to guess it, or to write a computer program that can guess it. Instead, we want to force our user to come up with a password that's a little bit better than that by using a regular expression. We're going to start by turning on multi-line anchors, and let's put in our start and end anchors here, because it's very important that we match the whole string from start to finish with our regular expression. To match it to begin with, let's just use any character, one or more times. That's the simplest possible thing. Any character, one or more times, and now we have a match, but there's a problem with it.
If we put a space in here, it still matches, and one of our requirements was that it can't include a space. So instead of saying it matches any character, let's now say it matches any non-space character, and that backslash, capital S is the shorthand for allowing us to do that. That's anything that is not a space, a tab, or a line return. So now, put a space in there, and it no longer works. Our next two requirements for the password are that it's between 8 and 15 characters long. Well, we know how to do that by using quantified repetition. We just say that it has to be between 8 and 15 characters, and now if we start typing a lot, it stops working, and if we don't type enough, it also doesn't work.
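As a rough sketch of where the pattern stands at this point, here is how it might look in Python. The choice of Python and the helper name `is_basic_match` are assumptions made for illustration; the course itself works in a regex testing tool.

```python
import re

# Start anchor, 8 to 15 non-whitespace characters, end anchor.
BASIC = re.compile(r"^\S{8,15}$")

def is_basic_match(candidate: str) -> bool:
    """Hypothetical helper: True if the candidate passes the basic pattern."""
    return BASIC.search(candidate) is not None

print(is_basic_match("swordfish"))           # 9 non-space characters
print(is_basic_match("sword fish"))          # contains a space
print(is_basic_match("sword"))               # fewer than 8 characters
print(is_basic_match("swordfishswordfish"))  # more than 15 characters
```

Only the first call should report a match. One Python-specific detail: `$` also matches just before a trailing newline, so `re.fullmatch` is a stricter alternative when the whole string must match.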
It has to be between 8 and 15 characters. That's the basics of just getting a match, but that doesn't take care of our other requirements; the things that are going to make sure that we use something besides a basic dictionary word. So our next four requirements are that the string must contain an uppercase letter, a lowercase letter, one digit, and one symbol. So how can we specify those in our regular expression? Well, to do that, we'll need to use lookahead assertions. Go back and look at that chapter if you need a refresher on how they work. What we'll do is put an assertion here at the start, before we get to the main regular expression. Now, it's going to be a group that has a special purpose.
The question mark is what tells us that it has the special purpose, and the equal sign indicates that its special purpose is that it's a positive lookahead. In order for this expression to match, then our assertion that's inside this group must be true. Now, our assertion is that there is somewhere a capital letter A to Z, and that it can be anywhere in the string. So what we're going to say is that there can be any character, zero or more times. So that way it can be the very first character; that's fine, or there can be some other characters, but somewhere in that string, we must match the capital letters A to Z.
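The assertion just described can be sketched in Python as follows. The names `WITH_UPPER` and `has_upper_and_length` are hypothetical and chosen only for this example.

```python
import re

# (?=.*[A-Z]) asserts that an uppercase letter appears somewhere ahead,
# then \S{8,15} does the actual matching from the same starting position.
WITH_UPPER = re.compile(r"^(?=.*[A-Z])\S{8,15}$")

def has_upper_and_length(candidate: str) -> bool:
    """Hypothetical helper: True if the candidate passes the pattern."""
    return WITH_UPPER.search(candidate) is not None

print(has_upper_and_length("swordfish"))   # no uppercase letter
print(has_upper_and_length("Swordfish"))   # uppercase in first position
print(has_upper_and_length("swordfisH"))   # uppercase in last position
print(has_upper_and_length("Sword fish"))  # assertion passes, \S fails on the space

# A lookahead consumes no characters: the match ends where it began.
print(re.match(r"(?=.*[A-Z])", "swordFish").end())
```

The last line prints 0, which is why the main expression can still start matching at the beginning of the string after the assertion has succeeded.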
Now, notice that I'm using the wildcard here. That's okay, because I have a second expression here that's going to make sure that it's not a space character. So here it's okay to just use the wildcard. It's going to be running both of these regular expressions. It first runs the assertion to make sure that the assertion is true, and if it's true, then it rewinds and checks the second regular expression. Now, notice it doesn't pass anymore, so we've got to change this. Now we have a capital letter in there, and now it does match. Let's do the lowercase letter as well. I'm just going to copy this, and I'll paste it here at the end. We'll make it a to z.
Of course, we already had lowercase letters, so it still passes. But notice what happens here. It runs two assertions. It first checks the first assertion, rewinds, and then checks the second assertion, and then rewinds again, and checks our main regular expression. Now, what you don't want to do by accident is this; it's not the same thing at all. This is an assertion that there's at least one character somewhere in the string that is either an uppercase or lowercase letter. That's not the same thing. What we want is an assertion that there's at least one uppercase, and an assertion that there is at least one lowercase.
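To make the difference concrete, here is a small Python comparison of the two forms. The names `SEPARATE` and `COMBINED` are labels invented for this sketch.

```python
import re

# Two assertions: at least one uppercase AND at least one lowercase letter.
SEPARATE = re.compile(r"^(?=.*[A-Z])(?=.*[a-z])\S{8,15}$")

# One assertion with a merged character set: at least one letter of either case.
COMBINED = re.compile(r"^(?=.*[A-Za-z])\S{8,15}$")

for candidate in ["swordfish", "SWORDFISH", "swordFish"]:
    separate_ok = SEPARATE.search(candidate) is not None
    combined_ok = COMBINED.search(candidate) is not None
    print(candidate, separate_ok, combined_ok)
```

The all-lowercase and all-uppercase candidates pass the combined version but fail the version with two assertions; only the mixed-case candidate passes both.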
For the digit, we can do the same thing. Let's copy this, and I'm going to put it here at the beginning. I am going to paste it in, and instead of saying it's a character a to z, we'll use our digit character. Now, notice that I put it before the other ones. Here's a tip for you: the assertion that you think is most likely to fail is often the best one to put first. It doesn't have to be, but it can be slightly more efficient if we can fail early. It's very likely that the password has lowercase letters, so why spend time doing that calculation first? Instead, do the one that may fail, and return the result faster, and save yourself the processing time of looking for the lowercase letter.
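A short Python sketch of the ordering point, assuming `\d` as the digit shorthand. The names `DIGIT_FIRST` and `DIGIT_LAST` are invented here. Reordering the assertions changes only how soon a failing password is rejected, never the result.

```python
import re

# Digit check first, since it is the assertion more likely to fail.
DIGIT_FIRST = re.compile(r"^(?=.*\d)(?=.*[A-Z])(?=.*[a-z])\S{8,15}$")

# Same assertions with the digit check last, for comparison.
DIGIT_LAST = re.compile(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)\S{8,15}$")

samples = ["swordfish", "swordFish", "swordFish42", "SWORDFISH42"]
results = []
for candidate in samples:
    first = DIGIT_FIRST.search(candidate) is not None
    last = DIGIT_LAST.search(candidate) is not None
    results.append((candidate, first, last))
    print(candidate, first, last)
```

Both patterns agree on every sample, and only `swordFish42` passes. Any speed difference depends on the regex engine and the input, so treat the ordering as a small optimization rather than a guarantee.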
So when possible, fail early. So now I've got to modify this. I'll put in 42, and now it matches. And then the last requirement is that we put in at least one symbol. Now, we could fail early, and put it at the very beginning. I'm actually going to make it the second one. We will first do a digit check, and then after that digit check, we're going to check for a symbol, and for the symbol I'm going to use a character set. And inside that character set, I'm going to list off all the characters that they can use, and I'm just going to bang on my keyboard here. Let's just go across the top row, and pick all of those symbols.
Now I get to the underscore, and then I've got the dash, and when you get to the dash, you need to make sure that you escape it, because it has a special meaning inside a character set. We've got plus, and equals. I've got the pipe, and the backslash. Again, backslash has a special meaning, so I need two of them. Let's use our curly braces; square brackets, but square brackets also have a special meaning, so let's make sure we go in front of those and put a backslash. I'm not going to allow the quote characters, but let's do the colon, and the semicolon, the two arrows, and the question mark, and the forward slash.
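Putting the pieces described so far together, the pattern might look something like the Python sketch below. This is a reconstruction under assumptions, not the course's final expression: the transcript does not spell out the top-row symbols, so `!@#$%^&*()` (a US keyboard layout) is assumed, and the names `SYMBOLS`, `PASSWORD`, and `is_acceptable` are hypothetical.

```python
import re

# Assumed symbol set: top-row symbols of a US keyboard, followed by the
# characters named in the transcript. The dash, backslash, and square
# brackets are escaped; quote characters are deliberately left out.
SYMBOLS = r"[!@#$%^&*()_\-+=|\\{}\[\]:;<>?/]"

# Digit check first, then symbol, then uppercase, then lowercase,
# followed by the main expression: 8 to 15 non-whitespace characters.
PASSWORD = re.compile(
    r"^(?=.*\d)(?=.*" + SYMBOLS + r")(?=.*[A-Z])(?=.*[a-z])\S{8,15}$"
)

def is_acceptable(candidate: str) -> bool:
    """Hypothetical helper: True if the candidate meets every requirement."""
    return PASSWORD.search(candidate) is not None

print(is_acceptable("swordFish42"))    # no symbol
print(is_acceptable("swordFish42!"))   # meets every requirement
print(is_acceptable("sword-Fish42"))   # escaped dash counts as a symbol
print(is_acceptable("sword Fish42!"))  # contains a space
print(is_acceptable("swordFish42'"))   # quote is not in the symbol set
```

Only the second and third candidates should pass. Keeping the symbol set in its own variable makes it easier to adjust the allowed characters without touching the rest of the pattern.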