From the course: Bash Patterns and Regular Expressions

What are extended globs? - Bash Tutorial

What are extended globs?

- [Instructor] In addition to standard globs, Bash supports extended globs. Extended globs give us more of the power of regular expressions for globbing. Unlike character sets or character classes, patterns can be more than one character long, and we can match multiple occurrences of a pattern. Although we can't specify an exact count, we can say zero or one, exactly one, one or more, or zero or more matches. Like regular expressions, they allow grouping patterns as well as nesting pattern groups for more control. Standard globbing provides a limited logical AND operation by including more than one search item in the criteria, as long as the order is respected. But extended globs add a logical OR: we can match one pattern or another. With nested groups, we can do a full logical AND operation as well. So let's talk more about these things.

With extended globbing, we can match exactly one occurrence of a pattern. To do so, we use the at symbol before our match, such as photo@(.jpg). The text outside of the extended glob needs to be matched literally, and because we're using the at symbol, what is inside also has to match exactly one time. This example is not very useful, as we could just leave the extended glob out and specify the entire text as a string. However, we can include a logical OR to make it more useful. We do this by including more than one pattern separated by the pipe symbol, as in photo@(.jpg|.png). Now the glob will only match photo.jpg or photo.png, but not photo with no extension, photo.gif, or photo.jpg.jpg, because the at symbol specifies that exactly one match has to exist, no more and no less.

Using the question mark, as in photo?(.jpg|.png), we can specify zero or one occurrence of a pattern. The text to the left of the question mark is not in the glob, so we have to match it. The patterns inside the glob are .jpg and .png. As a result, this would match the word photo, which would be zero occurrences of either pattern.
It would also match photo.jpg and photo.png, which would be one occurrence of either pattern. It would not match photo.jpg.jpg or photo.png.png, as they contain more than one occurrence of the pattern.

To match one or more occurrences, place a plus symbol before the parentheses, as in photo+(.jpg|.png). Again, we have to match photo to the left of the plus symbol, as it's not in the glob. We also have to match one or more occurrences of .jpg or .png. This means this glob will match photo.jpg and photo.png as well as photo.jpg.jpg and photo.png.png. It will not match photo without an extension, as at least one occurrence is necessary. It will also not match photo with any other extension or any other characters.

To match zero or more occurrences, we use an asterisk. This is a combination of the question mark, which is zero or one occurrence, and the plus, which is one or more. The glob photo*(.jpg|.png) will match everything that the question mark and plus globs matched, combined. This would match photo with no extension as well as with the .jpg or .png extensions in any number of occurrences.

We can also invert our match by using the exclamation point before the parentheses: photo!(.jpg|.png). In this case, we're inverting the match of one pattern or another. This matches any file name that starts with photo where the rest of the name is not exactly .jpg or .png. Notice that when we invert, we replace the operator defining how many occurrences are allowed with the inversion operator. If we want to include any of the operators already covered, then we need to nest our extended globs.

To demonstrate this, let's recap for a moment. I'll use the question mark extended glob as an example. The photo?(.jpg|.png) extended glob would match photo, because that's zero occurrences, and photo.jpg and photo.png, as those are one occurrence. It will not match anything like photo.gif, photo.bmp, or even photo.jpg.jpg, because that is either a different pattern or more than one occurrence of the specified pattern.
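
As a rough way to check the five operators covered so far, the sketch below builds a scratch directory and expands each glob into an array. Two things here are assumptions rather than part of the narration: the scratch directory and file list are only a stand-in for the course's exercise files, and the script turns on the extglob shell option itself, since Bash usually has it off by default in scripts.

```shell
#!/usr/bin/env bash
# Extended globs are usually off by default in scripts. Enable them before
# the first line that uses one, because bash needs the option at parse time.
# nullglob makes a pattern with no matches expand to nothing.
shopt -s extglob nullglob

# Scratch directory: an assumed stand-in for the course's exercise files.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
touch photo photo.jpg photo.png photo.gif photo.jpg.jpg photo.png.png

exactly_one=( photo@(.jpg|.png) )    # @ : exactly one of the patterns
zero_or_one=( photo?(.jpg|.png) )    # ? : zero or one occurrence
one_or_more=( photo+(.jpg|.png) )    # + : one or more occurrences
zero_or_more=( photo*(.jpg|.png) )   # * : zero or more occurrences
inverted=( photo!(.jpg|.png) )       # ! : rest of name is not exactly .jpg or .png

printf '@(...): %s\n' "${exactly_one[*]}"
printf '?(...): %s\n' "${zero_or_one[*]}"
printf '+(...): %s\n' "${one_or_more[*]}"
printf '*(...): %s\n' "${zero_or_more[*]}"
printf '!(...): %s\n' "${inverted[*]}"
```

Note that the inverted glob also lists photo and photo.jpg.jpg: in both cases the text after photo is something other than exactly .jpg or .png.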
Now that we have that established, let's invert the entire match by encapsulating it in another extended glob: photo!(?(.jpg|.png)). You may notice that the match and does-not-match columns just switched. It now matches photo.gif, photo.bmp, photo2.jpg, photo.png.png, and photo.jpg.jpg. Sometimes it's easier to create an extended glob to match what you don't want and then add the nested inversion glob when you're done.

You can group multiple extended globs together and then treat them as one match by creating a group. In this example, we would match files that begin with one or more occurrences of the words photo or file, have any number of any characters, and end with one or more occurrences of jpg or gif. Then we would invert that result and ultimately match all files that do not both start with photo or file and end with jpg or gif.

If you want to test any of my example extended globs, I've included all of the files I've mentioned in this video in the Chapter 2/globfiles directory of the Exercise Files. Use the extended globs we've covered here with the ls command to test them.
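
The nested inversion can be tried the same way. In the sketch below, the scratch directory and its file list are assumptions, not the actual exercise files. The grouped pattern is also an assumption: it is only one possible reading of the narration, since the pattern shown on screen is not part of this transcript.

```shell
#!/usr/bin/env bash
# extglob must be enabled before bash parses any extended glob.
shopt -s extglob nullglob

# Scratch directory: an assumed stand-in for the course's exercise files.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
touch photo photo.jpg photo.png photo.gif photo.bmp photo2.jpg \
      photo.jpg.jpg photo.png.png file.gif notes.txt

# The inner ?(...) accepts photo, photo.jpg and photo.png.
# The outer !(...) inverts that, so those three are excluded.
nested=( photo!(?(.jpg|.png)) )

# Hypothetical reconstruction of the grouped example:
# starts with photo or file, then anything, then ends in .jpg or .gif.
grouped=( +(photo|file)*+(.jpg|.gif) )
# Wrapping the whole group in !(...) inverts it.
grouped_inverted=( !(+(photo|file)*+(.jpg|.gif)) )

printf 'nested:           %s\n' "${nested[*]}"
printf 'grouped:          %s\n' "${grouped[*]}"
printf 'grouped inverted: %s\n' "${grouped_inverted[*]}"
```

A name only has to fail one part of the inner group to be picked up by the inversion, so photo.png appears in the inverted list even though it starts with photo.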
