sed: Regular expressions and back-references

From: Unix for Mac OS X Users

In the last movie, we got familiar with the syntax of sed, but all of our searching so far has been with literal text strings. Now we're going to learn to use regular expressions with sed. It may seem like sed is really similar to grep. That's because it is. All sed is, is grep and then substitute. So put another way, anything that you can find with grep, you can change with sed, and that includes making good use of regular expressions. So as a simple example of this, let's just have echo "Who needs vowels?" and we'll pipe that into a sed expression where we will look for anything that is in a, e, i, o, or u, inside a character set and we'll replace that with an underscore. We'll do it globally. So there you go.
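The vowel example described above can be sketched as a single pipeline. The variable name is only there for illustration; the substitution itself is the point.

```shell
# [aeiou] is a character set matching any one lowercase vowel.
# The trailing g applies the substitution to every match on the line,
# not just the first one.
vowels_out=$(echo "Who needs vowels?" | sed 's/[aeiou]/_/g')
echo "$vowels_out"
```

Uppercase vowels would be left alone, because the character set lists only lowercase letters.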

You see it took all of the vowels that were in that character set and replaced them with the underscore. So we can use regular expressions. Now, the regular expressions here work exactly like they do with grep, meaning that we also have an issue with basic versus extended regular expressions. So for example, if I put the plus sign in there, it doesn't work anymore, because the plus is part of the extended regular expression set and just like grep, we would use the -E option to be able to use those extended features. So let's try a couple.
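As a rough sketch of the basic versus extended difference just described, the same expression can be run both ways. The variable names are illustrative only.

```shell
# Basic regular expressions (the default): the plus sign is treated as an
# ordinary character, so nothing matches and the text passes through as is.
basic_out=$(echo "Who needs vowels?" | sed 's/[aeiou]+/_/g')
echo "$basic_out"

# Extended regular expressions (-E): + means "one or more", so each run of
# consecutive vowels is replaced by a single underscore.
extended_out=$(echo "Who needs vowels?" | sed -E 's/[aeiou]+/_/g')
echo "$extended_out"
```

The difference shows up in "needs", where the two consecutive vowels become one underscore instead of two.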

Let's say we have, for example, our fruit file; that's just cat fruit.txt. For that, let's write a sed expression that will take any line that begins with p and replace that p with space, space, p. Now, notice I had to repeat the p again. Because I'm finding it, it's going to be part of what gets replaced, so I want to make sure that I still include it in the replacement. So there we go, and we'll just run that on our fruit.txt file.
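Here is a minimal sketch of that substitution. The contents of fruit.txt are not shown in the transcript, so the sample file below is an assumption made up for illustration.

```shell
# Hypothetical stand-in for fruit.txt; the real file's contents may differ.
fruit_file=$(mktemp)
printf '%s\n' pear raspberry banana peach apple pineapple > "$fruit_file"

# ^p matches a p only at the start of a line. The p is part of what gets
# replaced, so it is repeated after the two spaces in the replacement.
indented_p=$(sed 's/^p/  p/' "$fruit_file")
echo "$indented_p"

rm -f "$fruit_file"
```

Lines that do not begin with p, such as apple, pass through untouched even though they contain a p.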

So every line that began with a p got indented two spaces. We could also leave out the p and just indent everything two spaces. A variation on that would be to, instead of having two spaces, put the right angle bracket (>) in there. That does the same thing as when we quote a mail message, right? If we reply to a mail message, our mail editor might stick those in front of the message we're replying to. One important thing that you might run into is you might think, well, instead of spaces in front of every line, what if I wanted to put a tab? And we have this shortcut for the tab character, which is the \t. That doesn't work here.
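The two variations mentioned above, indenting every line and prefixing every line with a quote marker, might look like the sketch below. The two-line input is a made-up sample, not the course's fruit file.

```shell
# ^ on its own matches the empty position at the start of every line,
# so the replacement text is simply inserted in front of each line.
indent_all=$(printf '%s\n' pear banana | sed 's/^/  /')
echo "$indent_all"

# Same idea, with a mail-style quote marker instead of two spaces.
quoted=$(printf '%s\n' pear banana | sed 's/^/> /')
echo "$quoted"
```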

sed doesn't understand that \t, or at least the Mac version of sed doesn't. There are other versions that do, but the Mac version doesn't understand it. So in order to get it, the trick that you need to know is that in bash if we want to type a tab character, the special tab character, the way to do it is to type Ctrl+V and then the actual character. So hit Ctrl+V and then tab, and then we'll actually get that tab effect. Ctrl+V works for other characters as well. Ctrl+V and Enter, Ctrl+V and Escape.
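Ctrl+V is something you type at the prompt. For a saved script, one possible alternative (a swapped-in technique, not the one shown in the video) is to let printf produce the tab and hand it to sed through a shell variable. The variable names here are assumptions for illustration.

```shell
# printf understands \t, so it can produce a real tab character for us.
tab=$(printf '\t')

# Double quotes let the shell expand ${tab} before sed sees the expression,
# so sed receives an actual tab rather than the two characters \ and t.
tabbed=$(printf '%s\n' pear banana | sed "s/^/${tab}/")
echo "$tabbed"
```

Because the shell does the expansion, this does not depend on whether a particular version of sed understands \t.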

It will type the actual character for you. You won't need it most of the time, but this is one of those cases where it definitely comes in handy. Let me give you one last example in this before we move on. Inside the directory I am in I've added a new file called homepage.html. What that is, is just a basic homepage for a fake company. So you can use any HTML you have. I just wanted to have some HTML to work with. Let's construct a sed script that will remove all of the tags from this, all the HTML tags, so that what we're left with is just text.

Well, the way we could do that with sed is to use -E, capital E, so we can make use of the extended regular expression set, and then substitute. We know we're going to want to find everything that's inside those tags, the angle brackets, so I'll just type those for now. And then we're going to remove them globally inside homepage.html. So now we just need to write a bit more of our regular expression. Inside those angle brackets, what are we going to have? What's allowed to be in there? Well, you could say a lot of things. I'm going to say that the thing that defines it is that it is not those angle brackets.
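The command this walkthrough is building toward might look like the sketch below: an opening angle bracket, a negated character set with a plus sign after it, and a closing angle bracket, replaced by nothing. The sample markup is a hypothetical stand-in for homepage.html.

```shell
# Hypothetical stand-in for homepage.html; any HTML would do.
html_file=$(mktemp)
printf '%s\n' '<h1>Welcome</h1>' '<p>We make <b>great</b> widgets.</p>' > "$html_file"

# <       an opening angle bracket
# [^<>]+  one or more characters that are neither < nor > (+ needs -E)
# >       a closing angle bracket
# The empty replacement deletes each match; g repeats it across the line.
stripped=$(sed -E 's/<[^<>]+>//g' "$html_file")
echo "$stripped"

rm -f "$html_file"
```

Since sed works through its input one line at a time, a tag that is split across two lines would not be matched by this expression.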

It could be any other character besides those, and I'm not going to be picky. And then we'll put our plus sign after it to show that there can be one or more of those. So that then takes our HTML and strips out those tags. You certainly can come up with better regular expressions using more advanced techniques. You can see, for example, it didn't filter out this first tag because it's broken across two different lines, and sed works on one line at a time. It's not perfect, but you do get the idea of what it can do for you.

Now let's talk about back references. Back references are actually part of regular expressions and sed makes good use of them. Let me show you a good example.

Let's say we have echo 'daytime' and we want to use sed to change daytime into daylight. We might be looking for things that are much more complicated, though. We might not be looking for the literal daytime, right? We might be constructing some very fancy regular expression here. We might be saying, well, look for anything that is ...time, any three characters followed by time, and we don't care what they are. And what we want to do is take that thing, whatever that thing was, and use it again.

We don't know what it is ahead of time. It's not necessarily day. It might be some other three letters. Well, the way that we do that is with a back reference: we put parentheses around the part of the search string that we want to capture, and in the replacement we write a backslash and then the number of the back reference. We can have more than one of these defined in our search string. In fact, I believe it supports up to nine, so you can have up to nine of these and they will then say, "Ah, the first set of parentheses, well, that corresponds to \1. The second set of parentheses, that corresponds to \2, and so on." Now, if we try and run this, it doesn't work, and that's because these parentheses here have to either be escaped to work with basic regular expressions or, if we don't escape them, we have to say this is an extended regular expression.
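The exact command is not spelled out in the transcript, so the following is a reconstruction under that caveat: three dots for "any three characters", wrapped in parentheses so that \1 can reuse them. Both the escaped and the extended forms are shown.

```shell
# Basic regular expressions: the grouping parentheses must be escaped.
basic_ref=$(echo 'daytime' | sed 's/\(...\)time/\1light/')
echo "$basic_ref"

# Extended regular expressions (-E): the parentheses are written unescaped.
extended_ref=$(echo 'daytime' | sed -E 's/(...)time/\1light/')
echo "$extended_ref"

# The captured text does not have to be "day"; any three characters work.
other_ref=$(echo 'xxxtime' | sed -E 's/(...)time/\1light/')
echo "$other_ref"
```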

So anytime we use those parentheses in there, it has to either be extended or they have to be escaped. And just to make sure that you understand the difference here, let's say that instead of day it said something like, we'll put xxx for now. So you can see that it took those same three characters that it found. It didn't care whether they were day or something else. It took those and dropped them into the replacement string. Let me give you a real world example. I think that will make it clear why this is really useful. Let's say that I have a name like Dan Stevens.

I can pass that into a sed script that will say, all right, take any characters that could be in the first name, followed by a space, followed by any characters, and then reverse them with a comma between them. Look at that. Dan Stevens suddenly becomes Stevens, Dan. So you can see how you can do this not just to this little bit of input that I'm sending it, but to an entire file. Let's try something similar by using our fruit file. So I have a sed expression here that's going to look for either apple, pear, plum, or peach, and for any of those it will append tree after it.
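The two back-reference substitutions described in this passage could be sketched as follows. The character set used for "characters in a name" and the contents of the fruit file are both assumptions, since the transcript does not show them.

```shell
# Two capture groups: \1 is the first word and \2 is the second.
# [A-Za-z]+ as the pattern for a name is an assumption for this sketch.
swapped=$(echo 'Dan Stevens' | sed -E 's/([A-Za-z]+) ([A-Za-z]+)/\2, \1/')
echo "$swapped"

# Hypothetical stand-in for fruit.txt; the real file's contents may differ.
fruit_file=$(mktemp)
printf '%s\n' pear raspberry banana peach apple pineapple > "$fruit_file"

# The group matches any one of the four alternatives, and \1 puts back
# whichever one was found, followed by " tree".
trees=$(sed -E 's/(apple|pear|plum|peach)/\1 tree/' "$fruit_file")
echo "$trees"

rm -f "$fruit_file"
```

Note that the alternation is not anchored, so it also matches the apple inside a longer word.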

So we get pear tree, then we get raspberry and banana as they were, then we get peach tree, apple tree, and pineapple tree, since pineapple contains apple. So any of those that matched our regular expression got reused in our output. Now, there is a lot more that sed can do, but this shows you some of its most common uses, and I think it gives you a solid foundation for exploring further on your own.

Author: Kevin Skoglund

 