Unix for Mac OS X Users unlocks the powerful capabilities of Unix that underlie Mac OS X, teaching how to use command-line syntax to perform common tasks such as file management, data entry, and text manipulation. The course teaches Unix from the ground up, starting with the basics of the command line and graduating to powerful, advanced tools like grep, sed, and xargs. The course shows how to enter commands in Terminal to create, move, copy, and delete files and folders; change file ownership and permissions; view and stop command and application processes; find and edit data within files; and use command-line shortcuts to speed up workflow. Exercise files accompany the course.
So we've already gotten a look at the wildcard, the period, and we've also seen character sets when we were working with finding both peach and pineapple. Let's try the beginning-of-line and end-of-line anchors. So for example, grep with the beginning-of-line anchor followed by p inside fruit.txt will find every line that begins with p. Notice it did not match apple. Apple has a p in it, but it's not at the beginning of the line. I am going to do the same thing with the end-of-line anchor. We can search for berry followed by that anchor, and that will return every line that has berry at the end.
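As a rough sketch of the two anchor searches just described, the commands might look like this. The file contents here are made up for illustration (the real fruit.txt from the exercise files may differ), and the lowercase spelling of the fruit names is an assumption.

```shell
# Hypothetical sample data standing in for fruit.txt.
fruit_file="$(mktemp)"
printf '%s\n' apple banana peach pear pineapple raspberry strawberry > "$fruit_file"

# ^ anchors the match to the beginning of the line, so apple is skipped.
starts_with_p="$(grep '^p' "$fruit_file")"
echo "$starts_with_p"

# $ anchors the match to the end of the line.
ends_with_berry="$(grep 'berry$' "$fruit_file")"
echo "$ends_with_berry"

# Clean up the temporary file.
rm -f "$fruit_file"
```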
Now in this case actually all occurrences of berry were also at the end of the line, but with the anchor it will match berry only when it is at the end of the line. Let me show you that. One way that you can work with grep that's really useful is, instead of working with a file, to pipe in a string from echo. Let's say we have berry bush. We'll pipe in the string, and then we'll grep for 'berry$'. Now, it didn't find it, because berry is not at the end of the line. If instead we search berry bush berry, now it does find it. It matched the last one, not the first one.
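Here is a small sketch of that echo-and-pipe technique. Since colorized output doesn't show up in plain text, it uses grep's -o option, which prints each matched piece on its own line, as a stand-in for the highlighting. The variable names are just for illustration, and the unanchored search is included only for comparison.

```shell
# No match: berry is present, but not at the end of the line.
# (|| true keeps a strict shell from stopping on grep's "no match" exit status.)
no_match="$(echo 'berry bush' | grep 'berry$' || true)"
echo "no_match is empty: [$no_match]"

# With the anchor, only the final berry is matched.
anchored="$(echo 'berry bush berry' | grep -o 'berry$')"
echo "$anchored"

# Without the anchor, both occurrences are matched.
unanchored="$(echo 'berry bush berry' | grep -o 'berry')"
echo "$unanchored"
```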
You can see I have my colorization turned on, so it just colorized the last one. If we do the same thing but take away that end-of-line anchor, now it finds it both times. Let's try another one. Let's do echo, and let's do AaBbCcDdEe. There we go! And let's pipe that into grep, and this time I'm going to make sure that color is on. If you don't have your color on, this will turn your color on. Let's search for just [:upper:]. We're looking for uppercase. We're going to use that character class.
Notice what happened here. It didn't match the uppercase letters that you might have expected it to, because it's not interpreting this as being a character class. It's interpreting this as being a character set. It thinks that we want to match anything that is in this character set: a colon, u, p, e, or r. Lowercase e is in this character set, so it got matched. There is no u, p, or r in the string. That's why only the lowercase e got matched. Let's instead put double brackets around this, and we should probably put quotes around it too. It's always a good practice, remember.
Now, it returns what we expect it to match. So I just point that out to you to make sure that you see that the single brackets with the colons are only the name of the class of characters. If we want to actually use it inside grep, we use the double ones so that it's saying a character set made up of the character class. I'm just going to paste in another example, so you don't have to watch me type it. I've just got a bunch of punctuation here and I'm going to search that with the punctuation class to find all the punctuation. I think that gives you the basics for regular expressions. Again, it's a very deep subject, and this is really only the surface of it.
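A minimal sketch of the double-bracket form follows. The punctuation string is invented for illustration, not the one pasted in the video, and -o is used again in place of color. One caveat, offered as an assumption about your setup: the single-bracket form may not behave as it did in the video everywhere, since some newer grep versions report an error for it instead of treating it as a plain set.

```shell
sample='AaBbCcDdEe'

# Double brackets: a character set whose only member is the class [:upper:].
# tr -d '\n' joins the one-match-per-line output of -o back into one string.
uppers="$(echo "$sample" | grep -o '[[:upper:]]' | tr -d '\n')"
echo "$uppers"

# The same pattern shape works for punctuation (hypothetical test string).
punct="$(echo 'Hello, world! (ok?)' | grep -o '[[:punct:]]' | tr -d '\n')"
echo "$punct"
```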
There are even sites that catalog regular expressions. So if you're looking for a regular expression that will match every phone number or every email address, all those different combinations for how those might be formatted, those exist. People have written them and they've shared them on different websites, and you can make use of them without having to reinvent the wheel. But what I do want to show you is there are a couple of things to watch out for when using regular expressions with grep. The first of these is that if we were to grep for 'ap*le' inside *fruit.txt, the two asterisks here have different meanings.
This is the regular expression asterisk. It means that the p is repeated zero or more times. This would match ale, aple, apple, appple, and so on. This one is a wildcard for the file system, totally different meaning. So I want to make sure that you see those and realize the difference. That's part of why we want to make sure that we put these inside quotes: it helps keep the two separate. The second is I'll take away the asterisk here and I'll run the command and you'll see that it does return the results that we would expect. If we run this other version with the plus sign instead, that's the operator that means one or more times.
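To make the two asterisks concrete, here is a sketch under some assumptions: the word list, the temporary directory, and the second file name more_fruit.txt are all hypothetical, invented so that the file-system wildcard has more than one file to expand to.

```shell
# Regular-expression asterisk: zero or more p's between a and le.
star_matches="$(printf '%s\n' ale aple apple appple axle | grep 'ap*le')"
echo "$star_matches"

# File-system asterisk: the shell expands *fruit.txt before grep ever runs.
work_dir="$(mktemp -d)"
printf '%s\n' apple pear > "$work_dir/fruit.txt"
printf '%s\n' banana ale > "$work_dir/more_fruit.txt"

# The quotes keep the shell away from the pattern; the file glob is left unquoted
# on purpose. With more than one file, grep prefixes each match with its file name.
glob_matches="$(cd "$work_dir" && grep 'ap*le' *fruit.txt)"
echo "$glob_matches"

# Clean up the temporary directory.
rm -rf "$work_dir"
```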
That would match aple and apple, but not ale. The p has to occur at least once. That's what the plus symbol means. If we run it now, we get nothing back. The reason why is it's taking that plus to be the literal plus sign. This is what I was talking about with the basic and extended regular expressions. So it's an important sort of gotcha when working with regular expressions, that there are the basic ones that work all the time, and then there is this extended set of a few things that only work in some cases. If we want to use the extended set in grep, we need to use the -E option. Now it works! Now, it finds it exactly as we would expect.
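The plus-sign gotcha can be sketched like this, using a short made-up word list rather than the course's fruit.txt.

```shell
words="$(printf '%s\n' ale aple apple)"

# Basic regular expressions: + is a literal plus sign, so nothing matches.
# (|| true keeps a strict shell from stopping on grep's "no match" exit status.)
basic="$(printf '%s\n' "$words" | grep 'ap+le' || true)"
echo "basic is empty: [$basic]"

# Extended regular expressions (-E): + means one or more of the preceding item.
extended="$(printf '%s\n' "$words" | grep -E 'ap+le')"
echo "$extended"
```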
So if you want to use a few of these extra features, you're going to need to use that -E. There are more of them than just the three I showed you. Those are the most basic and the ones that come up and cause the problem most often. Could you use -E all the time? Absolutely. Another spot where this split between basic and extended regular expressions causes problems is when we're working with that OR operator as well. So for example, say we grep for apple or pear inside fruit.txt. It feels like a very simple operator. It feels like something we ought to be able to do. It says oops, it's not there.
All we've got to do is put that -E in front of it and now it does find every occurrence of either apple or pear. So regular expressions, especially when combined with grep, are really great for finding exactly what you want. But what they don't do is allow you to change or manipulate anything. All they do is allow you to find it. So in order to make changes to manipulate the data, we're going to need to take a look at a few more Unix tools.
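For reference, the OR search discussed above might be sketched as follows. The word list is an assumption standing in for fruit.txt.

```shell
fruits="$(printf '%s\n' apple banana pear pineapple)"

# Basic regular expressions: | is a literal pipe character, so nothing matches.
# (|| true keeps a strict shell from stopping on grep's "no match" exit status.)
basic_or="$(printf '%s\n' "$fruits" | grep 'apple|pear' || true)"
echo "basic_or is empty: [$basic_or]"

# Extended regular expressions (-E): | means either alternative.
# pineapple is included because it contains apple.
extended_or="$(printf '%s\n' "$fruits" | grep -E 'apple|pear')"
echo "$extended_or"
```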