Finding and replacing using backreferences

From: Using Regular Expressions

In this movie, I want us to take a look at Find and Replace using backreferences, because this is a very powerful use of backreferences, and of regular expressions in general. Now, in order to look at Find and Replace, we won't be able to use the JavaScript RegexPal we have been using, because all it does is simply tell us, did we find a match or not? It doesn't do Find and Replace. In order to do Find and Replace, we'll either need a programming language, or we'll need a text editor, and that's what I am going to be using. Now, if you have a text editor you can go ahead and use it, or you can install one like mine, or you can just simply follow along for this one movie to get the idea.

The text editor I'll be using is TextMate for Macintosh. If you're on Windows, the E Text Editor is a very similar product. So the file that I am going to be demonstrating with is in the exercise files, and it's called us_presidents.csv. CSV stands for comma separated values, and it's a list of the U.S. Presidents, with each of the values separated by commas. And those values are the number that they served, their term, their name, the start date, the end date, their party, their state that they came from, and the URL for their Wikipedia entry.

What we want to do for this exercise is to take their name, like George Washington, and separate those into two separate fields, and flip them around. So instead it will be Washington, George. Essentially, instead of just name, we are looking for last name, and then comma, first name; that's what we are going for. So we want to use regular expressions and backreferences with Find and Replace to make that happen. The way that we do Find in TextMate is under the Edit menu; there is Find. If you pull down to that, pull out here to just simple Find. You'll also see that there is a shortcut, which is Command+F, and also before we select that, I just want you to see that down here there is another shortcut Command+G, which is Find Next. I am going to be making use of that as well.

So we pick Find, and it comes up with a basic Find window, and this will let us find just literal text inside the document. If you want a bigger window to work with, you can actually click this here. I am going to go ahead and leave it small. There is no reason that I need all that extra room. Notice here that it says Regular expression. If I don't want to search for literal text, I check this box, and now in this Find box, I enter regular expression syntax. So if this were a problem in the real world that I was trying to solve, I would try and solve it in three steps. I would first write a regular expression that matches the thing that I want to replace. Second, I'd put in the capturing groups to capture the things I want, and third, I would write the replacement string.

So let's start by first writing that regular expression. In order to make sure that I'm getting that first name and last name column, there are a couple of different ways that we could do that. The way that I'm going to do it is I am going to go ahead and just start at the beginning of the line, and then I'm going to go ahead and provide a regular expression that extends beyond the name as well. It's probably not necessary to write quite as much as I'm going to, but it's not a bad set of precautions to take to make sure that you're finding exactly what you want, and nothing else. So at the beginning of the line, I am going to say I am looking for a digit, and it could either be one digit or two digits, followed by a comma, and then the name of the person.

Notice that these first names can include their middle initial and a period, so I am going to say that would be a word character, or a space, or a period inside a set. There's got to be at least one of them, and then it would be repeated. Let's make it not greedy, and then a space, and then we need to do the last name. And the last name, then, would be the same thing, but no period necessary this time. We'll make it not greedy as well. It's a little bit overkill, but I would go ahead and add a comma, and then backslash d repeated four times, to make sure that I was getting those four digits, and that way I can make sure that I am matching exactly what I want; that I am getting the expected column.
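For readers following along in a programming language rather than a text editor, the pattern described above can be sketched in Python roughly as follows. The sample rows are an assumption: they only imitate the shape of us_presidents.csv (number, name, then a field starting with four digits) and leave out the remaining columns. The names `SAMPLE`, `FIND`, and `find_all` are made up for this illustration.

```python
import re

# Hypothetical rows that imitate the layout described in the transcript.
# They are NOT copied from the exercise file; later columns are omitted.
SAMPLE = """\
1,George Washington,1789,1797
2,John Adams,1797,1801
6,John Q. Adams,1825,1829
8,Martin Van Buren,1837,1841
"""

# Start of line, one or two digits, a comma, a lazy first name made of
# word characters, spaces, or periods, a space, a lazy last name made of
# word characters or spaces, a comma, and exactly four digits.
# re.MULTILINE lets ^ match at the start of every line, not just the first.
FIND = re.compile(r"^\d{1,2},[\w .]+? [\w ]+?,\d{4}", re.MULTILINE)


def find_all(text):
    """Return each matched span in order, like pressing Find Next repeatedly."""
    return [match.group(0) for match in FIND.finditer(text)]


if __name__ == "__main__":
    for hit in find_all(SAMPLE):
        print(hit)
```

Each match stops after the four digits, so the rest of the row is left outside the match and would not be touched by a replacement.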

So let's try that out. Hit Return, and it found George Washington. Now I am going to use Command+G, just to go to the next one. Scroll through the list, and make sure that it's matching what we expect it to match. So once I've done enough of those to satisfy myself -- you could go through the whole list if you wanted -- click back up here at the top of the document again, let's go back to Find, and now I am going to put in the capturing groups. Now, here is an important point: when we do a match here -- just watch the match again, and see what matched -- anything that's part of that match is going to be part of the replacement.

So anything we want to stay there, we've got to put back there again. So if I want that number one there, I have got to put the number one, but the number one varies from line to line. It's number two when we get to John Adams, and number three when we get to Thomas Jefferson. So therefore, what I need to do is I need to capture it by putting those parentheses around it, and then now I can replace it with itself down here. We talked about backslash one being the way to make backreferences. Some editors, including TextMate, will use dollar sign one when you're down here in the replacement string.

It uses backslash one when we are putting it in the same string; dollar sign one when it's down below. I could capture the comma too, but I am going to go ahead and put the comma in. Let's go ahead and do this name, so we've got the first name, and any middle initial, followed by a space, and then the last name, and then a comma, and then again we want to make sure that we capture those last four digits as well. Now to start with, I am just going to put in dollar sign, two, space, dollar sign, three, comma, dollar sign, four.

That would replace it with exactly what it found; it would not do any transformation. Instead, though, what we want to do is change it so that we have dollar sign, three, comma, dollar sign, two. So now three, which is the thing being captured as the last name, will come first, followed by a comma, and then the first name and middle initial, followed by another comma. Let's click back here at the beginning, and then once we are in that top line, let's click next to find it, and then let's watch it. Let's click Replace & Find.

Do this replacement, and find the next one at the same time. George Washington did exactly what we expected. Let's try John Adams; same thing. So we can go down these, and we can just watch it. I typically would watch to make sure something like John Q worked out. Van Buren; that worked out okay, and then once we are satisfied that enough of these are working, you can just hit Replace All, and it will do the whole list for you. Now we've accomplished our Find and Replace. We've saved ourselves a whole lot of data entry, and a whole lot of typing, by using the power of regular expressions, and the power of backreferences.
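The same capture-and-replace steps can be sketched in Python. This is an illustration under assumptions, not the editor's own behavior: Python's `re.sub` writes backreferences in the replacement string as `\1` (or `\g<1>`) rather than TextMate's dollar sign form, the sample rows are hypothetical stand-ins for the exercise file, and the function names are invented here.

```python
import re

# Hypothetical rows imitating the CSV layout; not the real exercise file.
ROWS = """\
1,George Washington,1789,1797
6,John Q. Adams,1825,1829
8,Martin Van Buren,1837,1841
"""

# Four capturing groups:
#   1 = number, 2 = first name (plus any middle initial),
#   3 = last name, 4 = the four digits after the name.
PATTERN = re.compile(r"^(\d{1,2}),([\w .]+?) ([\w ]+?),(\d{4})", re.MULTILINE)


def keep_as_is(text):
    """Put every capture back where it was, so the text is unchanged."""
    return PATTERN.sub(r"\1,\2 \3,\4", text)


def flip_names(text):
    """Write last name, comma, first name; behaves like Replace All."""
    return PATTERN.sub(r"\1,\3,\2,\4", text)


if __name__ == "__main__":
    print(flip_names(ROWS))
```

The commas in the replacement string are typed literally because they were matched but not captured; leaving one out would silently drop it from every row.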

So to summarize the steps that we followed, first we created a regular expression that matched our target data; we tested it and revised it using anchors and more specific regexes to narrow the scope. Then we added in the capturing groups. We put the parentheses around the parts we wanted to capture, and you specifically want to capture anything that varies from row to row, because we are going to use the backreferences to reference the actual data that's there as the expression moves down, row to row. And then, last of all, write the replacement string.

We want to make sure we use all captures, assuming that we do want to keep all of them, and add back anything that was not captured, but still needed, such as the commas that we put in between our different values. And remember that you may need to use dollar sign, one, instead of backslash, one. If you follow these simple steps, they will help guide you as you unlock the power of using backreferences with Find and Replace.
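The two places a backreference can appear, inside the pattern and inside the replacement string, can be seen side by side in a small Python sketch. Note the assumption that Python is the tool: its `re` module uses backslash, one, in both places, whereas an editor such as TextMate may expect dollar sign, one, in the replacement. The doubled-word example and the function name are made up for illustration.

```python
import re

# \1 inside the pattern means "the same text that group 1 just matched",
# so this finds a word that is immediately repeated.
DOUBLED = re.compile(r"\b(\w+) \1\b")


def collapse_doubled_words(text):
    # \1 inside the replacement means "insert what group 1 captured",
    # so each doubled word is written back only once.
    return DOUBLED.sub(r"\1", text)


if __name__ == "__main__":
    print(collapse_doubled_words("the the power of of regex"))
```

Whichever syntax the tool uses, the numbering is the same: groups are counted by their opening parentheses, left to right.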

This video is part of

Using Regular Expressions

59 video lessons · 12234 viewers

Kevin Skoglund
Author

  1. 2m 18s
    1. Welcome
      56s
    2. Using the exercise files
      1m 22s
  2. 19m 55s
    1. What are regular expressions?
      3m 20s
    2. The history of regular expressions
      6m 40s
    3. Regular expression engines
      2m 44s
    4. Installing an engine
      4m 5s
    5. Notation conventions and modes
      3m 6s
  3. 21m 23s
    1. Literal characters
      6m 39s
    2. Metacharacters
      2m 1s
    3. The wildcard metacharacter
      4m 31s
    4. Escaping metacharacters
      4m 53s
    5. Other special characters
      3m 19s
  4. 31m 26s
    1. Defining a character set
      5m 49s
    2. Character ranges
      4m 49s
    3. Negative character sets
      4m 53s
    4. Metacharacters inside character sets
      5m 12s
    5. Shorthand character sets
      6m 30s
    6. POSIX bracket expressions
      4m 13s
  5. 36m 38s
    1. Repetition metacharacters
      7m 17s
    2. Quantified repetition
      6m 59s
    3. Greedy expressions
      6m 27s
    4. Lazy expressions
      6m 46s
    5. Using repetition efficiently
      9m 9s
  6. 20m 24s
    1. Grouping metacharacters
      4m 14s
    2. Alternation metacharacter
      4m 54s
    3. Writing logical and efficient alternations
      7m 33s
    4. Repeating and nesting alternations
      3m 43s
  7. 19m 19s
    1. Start and end anchors
      7m 21s
    2. Line breaks and Multiline mode
      4m 41s
    3. Word boundaries
      7m 17s
  8. 23m 33s
    1. Backreferences
      8m 57s
    2. Backreferences to optional expressions
      3m 51s
    3. Finding and replacing using backreferences
      7m 16s
    4. Non-capturing group expressions
      3m 29s
  9. 32m 31s
    1. Positive lookahead assertions
      6m 39s
    2. Double-testing with lookahead assertions
      7m 16s
    3. Negative lookahead assertions
      6m 10s
    4. Lookbehind assertions
      6m 26s
    5. The power of positions
      6m 0s
  10. 13m 13s
    1. About Unicode
      4m 19s
    2. Unicode in regular expressions
      4m 41s
    3. Unicode wildcards and properties
      4m 13s
  11. 1h 55m
    1. How to use this chapter
      5m 38s
    2. Matching names
      6m 33s
    3. Matching postal codes
      8m 54s
    4. Matching email addresses
      5m 0s
    5. Matching URLs
      8m 1s
    6. Matching decimal numbers and currency
      6m 45s
    7. Matching IP addresses
      7m 10s
    8. Matching dates
      7m 49s
    9. Matching times
      8m 59s
    10. Matching HTML tags
      8m 34s
    11. Matching passwords
      6m 49s
    12. Matching credit card numbers
      9m 36s
    13. Finding words near other words
      6m 38s
    14. Formatting with Search and Replace, pt. 1
      7m 22s
    15. Formatting with Search and Replace, pt. 2
      4m 15s
    16. Formatting with Search and Replace, pt. 3
      7m 10s
  12. 47s
    1. Goodbye
      47s
