Join Michael Murphy for an in-depth discussion in this video Using metacharacters, the building blocks of GREP, part of Learning GREP with InDesign.
The building blocks of GREP are special characters or combinations of characters known as metacharacters, which stand in for and describe different types of characters, conditions and locations. InDesign has had its own metacharacters for some time, but there are key differences between those metacharacters and GREP metacharacters that you need to know about. First, let's take a look at where metacharacters exist in InDesign already. They show up in Find/Change, which I'll access by going to the Edit menu and choosing Find/Change, or by using the keyboard shortcut Command+F, or Ctrl+F on Windows.
In a text-based search, if you are looking for something like a tab, you can't actually type it in this Find what field. If you do, you just jump back and forth from field-to-field. So a tab needs to be represented by something else, in this case, a metacharacter. Over to the right of that field, the special characters for search menu, the first item in the list is Tab. This is how you would indicate a tab in a search operation. When you choose that it puts in ^t. This is the metacharacter that represents a tab.
If you were looking for a Hard Return, or what InDesign calls the end of a paragraph, you would choose that, and get ^p. This is the InDesign convention for metacharacter syntax. They almost all start with a caret. I am going to close this window. Another place where you might have encountered these is if you were defining a Nested Style. Say, for example, you were creating a Paragraph Style. I am going to open the Paragraph Styles panel and choose New Paragraph Style from the panel menu.
If I am defining a Nested Style, under Drop Caps and Nested Styles, I will click New Nested Style, and for right now it doesn't matter what I define this as, but let's say I wanted to define a style through one tab. By default, Words is chosen here but from this menu I can choose Tab Character, and it actually puts in the text 'Tab Character'. If I were to type ^t, that would do the same thing. In fact, if I click off here, it actually swaps out the Tab Character for the metacharacter I put in.
But if I were defining this Nested Style through one Right Align tab, which is ^y, you'd see the metacharacter still in place. I'll cancel out of this. So these metacharacters show up in a number of places. Another one, in fact, is in InDesign's Bulleted and Numbered List function. With my cursor in this text and the Control panel in Paragraph mode, the Numbered List icon is over here. If I Option-click that icon, or Alt-click it on Windows, I get the Bullets and Numbering dialog box.
If I am defining a numbered list by choosing Numbers from the List Type menu, you can see here in the Number field that everything about the formatting of that number is defined by metacharacters. There is a Number placeholder beginning with a caret, and then again, ^t for a tab. So metacharacters have been around for a while and they define a lot of things in InDesign. But their syntax is somewhat different when you are working with GREP. I am going to cancel out of this and we're going to go back to the Find/Change dialog. Edit>Find/Change, and I am going to switch over to the GREP tab.
The dialog boxes look very similar, but when I go to the Special Characters for Search menu and click that, I have a much longer menu here. Tab, once again, is the first choice. If I click that, I get \t, not ^t. This is the significant difference between the syntax of InDesign metacharacters and GREP metacharacters. If I choose End of Paragraph, which is a Hard Return, I get \r, not ^p as we saw in the Text field. So that's one main difference.
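The backslash codes shown on the GREP tab are the same escapes that other regular-expression engines use. As a rough illustration outside InDesign, here is a minimal sketch in Python's `re` module. The sample string and variable names are made up for the example, and Python's engine is only a stand-in for InDesign's, so treat it as an analogy rather than a description of how InDesign works internally.

```python
import re

# Hypothetical sample text: tabs separate columns, and \r marks the end of
# each paragraph, the way the End of Paragraph metacharacter does above.
story = "Item\tPrice\rPens\t4.50\rInk\t12.00"

# \r finds the paragraph breaks; \t finds the tabs inside each paragraph.
paragraphs = re.split(r"\r", story)
rows = [re.split(r"\t", paragraph) for paragraph in paragraphs]

# In a regular expression the caret is an anchor, not a prefix, so the
# Text-tab code ^t does not describe a tab here: it means
# "a letter t at the very start of the text".
caret_t_finds_tab = re.search(r"^t", "a\tb") is not None
caret_t_finds_letter = re.search(r"^t", "tab") is not None

print(rows)
print(caret_t_finds_tab, caret_t_finds_letter)
```

The caret already has a job in regular expressions, which is one practical reason the Text-tab syntax cannot simply be reused on the GREP tab.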
GREP has been around much longer than InDesign and fortunately Adobe did not create their own version of GREP and change the syntax for people who are familiar with GREP walking in. But there are a handful of characters that are entirely unique to InDesign that GREP was never intended to deal with. These are things like the Current Page Number marker or Anchored Object markers or any of the dozen custom white spaces that are available to you in InDesign. In order to account for those, there is yet another change in syntax.
For example, if I click here and I choose from a Markers submenu, the Current Page Number marker character, that is described as tilde uppercase N. If I go back to the menu and choose something like a Flush Space from the White Space sub-menu, that's tilde lowercase f. These InDesign specific characters all start with a tilde instead of a backslash. So there are two syntax changes that you need to be aware of between InDesign metacharacters and GREP metacharacters.
Now there are a great many metacharacters in this menu. Each of these submenus can be very long or somewhat short. But it's a lot to remember, and most of it you're probably not going to commit to memory. Basics like Tab and Return you probably will over time, but if you want to have a reference for all the different metacharacters that are available to you, or at least most of them, go up to the Search field in the Application Bar, type in 'metacharacter', and hit Enter.
You'll either access Adobe's Online Community Help, if you are connected to the web, or if you're not you'll get the Local Help that's on your hard drive. The first match is InDesign CS4 Metacharacters for searching. Click that link and you'll get a table that shows you the character that you are looking for, a column with the Text Search metacharacter, which is InDesign's convention, and then a column with the GREP metacharacter that you would need to use. And if you look at them side-by-side, you can see, for the most part, they are quite similar.
Carets versus backslashes or carets versus tildes is about the only real difference. The End of Paragraph character is the most different: the Text metacharacter uses a p (^p) where the GREP metacharacter uses an r (\r). But by and large, they are very similar in these two columns. There are, however, a few metacharacters that are not listed here, and we'll encounter those later on in the course. So just to review, metacharacters represent other characters and, in most cases, GREP metacharacters differ from InDesign metacharacters only in that they begin with a backslash or tilde instead of a caret.
But that's not true for every metacharacter. As with all rules, there are exceptions.
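To make the review concrete, the side-by-side table can be sketched as a small lookup in Python. Only the pairs mentioned in this movie are included. The Text-tab codes for the Current Page Number marker and the Flush Space (`^N` and `^f`) are not stated above and are assumptions based on the caret convention, so check them against Adobe's table. The names `METACHARACTERS`, `text_to_grep` and `only_prefix_differs` are hypothetical helpers, not part of InDesign.

```python
# Character name -> (Text-tab metacharacter, GREP-tab metacharacter).
METACHARACTERS = {
    "Tab": ("^t", r"\t"),
    "End of Paragraph": ("^p", r"\r"),
    # Text-tab codes in the next two entries are assumed, not from the movie.
    "Current Page Number": ("^N", "~N"),
    "Flush Space": ("^f", "~f"),
}

def text_to_grep(text_code):
    """Return the GREP-tab equivalent of a Text-tab metacharacter."""
    for text, grep in METACHARACTERS.values():
        if text == text_code:
            return grep
    raise KeyError(f"No GREP equivalent recorded for {text_code!r}")

def only_prefix_differs(name):
    """True when the two codes share the same letter after the prefix."""
    text, grep = METACHARACTERS[name]
    return text[1:] == grep[1:]

for name, (text, grep) in METACHARACTERS.items():
    note = "prefix only" if only_prefix_differs(name) else "letter changes too"
    print(f"{name:20} {text:4} {grep:4} {note}")
```

A lookup is used instead of a blanket "swap the caret" rule because the prefix alone does not tell you whether the GREP version takes a backslash or a tilde, and End of Paragraph changes its letter as well.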
- Using metacharacters, the building blocks of GREP
- Describing text that may not exist with zero operators
- Applying multiple character styles to the same text with GREP styles
- Eliminating orphaned words at the ends of paragraphs
- Preserving and recalling subexpressions
- Customizing a GREP-based text cleanup script for long documents
Skill Level Intermediate
Q: In the “Dynamically fixing orphaned words with GREP” tutorial the author uses the term:
In an earlier course the author described the + (one or more) modifier as unusable in a lookbehind or lookahead i.e. (?<=.+). What's the difference here?
A: The limitation mentioned in an earlier movie referred only to positive lookbehind and negative lookbehind. I was able to use the one or more times (+) metacharacter in the positive lookahead portion of the expression because that limitation doesn't affect either positive or negative lookahead. It's only when looking backward that GREP ignores the repeat metacharacters.
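The same asymmetry between looking backward and looking forward can be seen in other regular-expression engines. Here is a minimal sketch using Python's `re` module as a stand-in. It is not InDesign's engine, and it behaves differently in one respect: Python rejects a repeat inside a lookbehind with an error rather than ignoring it. The helper `compiles` and the sample sentence are made up for the illustration.

```python
import re

def compiles(pattern):
    """Report whether Python's regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# One-or-more inside a positive lookbehind: rejected, because the engine
# needs to know how far back to look.
lookbehind_ok = compiles(r"(?<=.+)word")

# One-or-more inside a positive lookahead: accepted.
lookahead_ok = compiles(r"word(?=.+)")

# A lookahead with repeats in use: find the word that is followed by
# exactly one more word before the end of the text.
match = re.search(r"\S+(?=\s+\S+$)", "fix the orphaned word")
second_to_last = match.group()

print(lookbehind_ok, lookahead_ok, second_to_last)
```

The lookahead pattern is only a simplified stand-in for the kind of expression used in the orphaned-words tutorial, not the expression from the course.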