Learn how to find and manipulate text quickly and easily using regular expressions. Author Kevin Skoglund covers the basic syntax of regular expressions, shows how to create flexible matching patterns, and demonstrates how the regular expression engine parses text to find matches. The course also covers referring back to previous matches with backreferences and creating complex matching patterns with lookaround assertions, and explores the most common applications of regular expressions.
Back in Chapter 1, when we talked about the history of UNIX, we talked about the POSIX standardization that took place, and part of the POSIX standard was to come up with bracket expressions that would help define sets of characters. They are very similar to the character sets and the shorthand that we've been working with, but they do work a little bit differently. First of all, they look very different. They all have square brackets around the outside, then a colon next to each bracket, with a keyword in the middle, as in [:alpha:]. So you can see, for example, that alpha is the keyword for all alphabetic characters. Now, it's a little bit more typing than our shorthand, but you can see how it's useful to get only the letters A through Z, upper- and lowercase, whereas our shorthand, \w, included 0 through 9 and the underscore in that.
We can also specify upper- and lowercase letters. You can see there are keywords for lower and upper. That can be very handy as well. And then there are some for things like picking out all of the printable characters. That can be very useful before we print something: just find the printable characters in this document, meaning the characters and the spaces. Or the graphic characters, the things that actually took ink to write. Those are the graphic characters, not the spaces. Or to find all of the control characters, that kind of thing. So it's a very commonsense approach. The keywords are easy to read and not hard to memorize, and they do give us a little more specificity than we had by using the other shorthands.
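To make the keywords concrete, here is a minimal sketch of what a few of them cover, assuming plain ASCII text in the C/POSIX locale. Python's built-in re module does not understand the [:keyword:] syntax itself, so the POSIX_ASCII table and the members helper are illustrative stand-ins written for this example, not part of any library.

```python
import re

# ASCII (C/POSIX locale) equivalents for some POSIX class keywords.
# Assumption: this table is hand-written for illustration; Python's re
# module has no built-in [:keyword:] support.
POSIX_ASCII = {
    "alpha": "A-Za-z",
    "digit": "0-9",
    "alnum": "A-Za-z0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "space": r" \t\n\r\f\v",
    "graph": r"\x21-\x7e",      # visible characters, no space
    "print": r"\x20-\x7e",      # graph plus the space character
    "cntrl": r"\x00-\x1f\x7f",  # control characters
}

def members(keyword, text):
    """Return the characters of text that belong to the named class."""
    char_set = re.compile(f"[{POSIX_ASCII[keyword]}]")
    return [ch for ch in text if char_set.fullmatch(ch)]

sample = "aZ9_ -\t"

print(members("alpha", sample))   # letters only
print(re.findall(r"\w", sample))  # \w also takes the digit and underscore
print(members("graph", sample))   # everything visible, but not the space
print(members("print", sample))   # graph plus the space, still no tab
print(members("cntrl", sample))   # only the tab
```

On this sample, alpha keeps just the two letters while \w also keeps 9 and the underscore, and the only difference between graph and print is the space.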
They do work a little bit differently though. When we use these, we don't use them stand-alone; we use them inside a character class. So you would put two sets of square brackets, as in [[:alpha:]], or we could negate it by putting a caret in front of it inside the outer brackets, as in [^[:alpha:]]. They have to go inside a character class. Very important. The incorrect way is to use it just standing on its own, as in [:alpha:]. If you look at that for a second, think about how the regular expression engine sees that when it comes across it.
It comes across it and says, ah! An open square bracket, this is a character set. Everything that's in here must be part of a character set, and so it treats it like a character set. It doesn't recognize that it's a POSIX bracket expression. So, in general, if you're using POSIX expressions, I think it's a really good idea not to mix the POSIX sets with other shorthand sets. Just use one or the other. Pick one and stick with it. If one is not meeting your needs then switch over to the other, but don't mix them, because you can run into problems that way. Now, as far as support goes, you can use them in Perl, PHP, Ruby, and in UNIX, because remember, POSIX was a standardization effort on UNIX, so BREs and EREs should all support the POSIX standard.
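You can watch an engine make that mistake. Python's re module has no POSIX class support, which makes it a convenient stand-in here: it reads a stand-alone [:digit:] as an ordinary character set, the way the transcript describes. This is only a sketch of the parsing behavior; the candidates string is made up for the example.

```python
import re

# Standing alone, "[:digit:]" is just a character set holding the literal
# characters ':', 'd', 'i', 'g', 't' (the repeated ':' and 'i' are harmless).
standalone = re.compile(r"[:digit:]")

# Hypothetical test characters: every digit, plus the set's literal members.
candidates = "0123456789:digt"

literal_hits = [ch for ch in candidates if standalone.fullmatch(ch)]
digit_hits = [ch for ch in candidates if re.fullmatch(r"[0-9]", ch)]

print(literal_hits)  # the punctuation and letters, no digits at all
print(digit_hits)    # what [[:digit:]] is meant to match in ASCII
```

Not one digit matches the stand-alone form. Some tools, recent GNU grep among them, reject the stand-alone form with an error rather than matching silently, so the exact symptom can vary by tool.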
I'm going to tell it I want to find everything that has an S, followed by a character set, and inside that character set is going to be my POSIX expression. Colon, colon, and let's tell it to find everything that is a digit, so the character set reads [[:digit:]]. There we go! It will go through all of my running processes and look for anything that's an S followed by a digit, and there it is. There are the things that it found and that it brought up for us to look at. Now, it does not work of course if we take the outer brackets away, because now it's going to find everything that has an S followed by one character, which is either a colon, a D, an I, a G, an I, or a T. It looks at that as a literal character set.
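The same comparison can be sketched without a terminal. The lines below are made-up stand-ins for process-list output, not the course's actual data, and the lowercase s is an assumption about what was typed. Because Python's re module has no [[:digit:]], the first pattern uses its ASCII equivalent, [0-9]; the grep function is a hypothetical helper written for this example.

```python
import re

# Hypothetical sample lines standing in for `ps` output.
lines = [
    "kevin  501  s000  -bash",
    "kevin  502  s001  vim notes.txt",
    "root    42  ??    /usr/sbin/distnoted",
    "root    77  ??    /usr/libexec/systemstats",
    "root     1  ??    /sbin/launchd",
]

def grep(pattern, lines):
    """Return the lines that contain a match, like a bare-bones grep."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

# Stand-in for s[[:digit:]]: an s followed by a digit.
with_class = grep(r"s[0-9]", lines)

# The outer brackets removed: an s followed by ':', 'd', 'i', 'g', or 't'.
without_class = grep(r"s[:digit:]", lines)

for line in with_class:
    print("digit class :", line)
for line in without_class:
    print("literal set :", line)
```

With this sample, the first pattern picks out the two terminal lines (s000 and s001), while the second picks out the lines where an s happens to be followed by a t, which is a different result from the one intended.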
So you'll see we get back a whole lot more stuff there. So that illustrates the point of how these POSIX expressions work. If you find that the regular character sets are not able to narrow it down to what you need, these can be a handy way to do it. Now, I think you'll find that a lot of the time either writing your own character set or using the other shorthands will do the trick for you, but it is nice to have this tool in your toolbox in case you do need to be a little more specific.