From the course: Bash Patterns and Regular Expressions

Regexes in if conditionals - Bash Tutorial

- [Speaker] The only support for regular expressions in Bash is in the double-square-bracket if conditional. In all other cases, the best pattern matching available is extended globs, which we covered earlier in this course. To use extended globs for pattern matching in an if conditional, we use the = or == operator. In Bash there is no difference between the two, so use the one you prefer. For this example, I'm using the exact same extended glob that we created in chapter two of this course. Notice that I've stored the extended glob in a variable, and then used the variable in the if conditional. I did this to keep the pattern separate for troubleshooting, and also to keep the formatting of the if conditional simpler. One advantage of extended globs is that we can also use them in a for loop, which we cannot do with a regular expression.

Now let's talk about a simple regular expression solution to this problem. Bash's in-process regular expressions support EREs only, and only inside double square brackets using the =~ comparison operator. This example would be true if the value of the file variable was text that started with the word Backup, was followed by any number of characters, and ended with .tar.gz. One thing to keep in mind when using regular expressions in Bash is that you cannot place quotes around the regular expression in the if conditional. Not only are quotes not required inside the double square brackets, quoting the pattern actually breaks the match, because Bash then treats it as a literal string.

I've created two scripts, one using extended globs in an if conditional, and the other using a regular expression. I've included them in the chapter five directory of the exercise files for this course. Let's take a look at one. In the chapter five directory, type vi regexfiles.sh and press Enter. Notice that I've included the extended glob solution to this problem in the script as well, so we can directly compare the two.
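As a minimal sketch of the two forms described above, the snippet below stores each pattern in a variable and tests a file name against it. The file name, the function names, and both patterns are illustrative assumptions, not the contents of the course's exercise files.

```shell
#!/usr/bin/env bash
# Enable extended globs (Bash also enables them implicitly for == inside [[ ]]).
shopt -s extglob

# Hypothetical file name, for illustration only.
file="Backup-2024-01-05.tar.gz"

# Extended glob: compared with == (or =) inside [[ ]]; the variable stays unquoted.
glob='@(Archive|Backup)*.tar.gz'
# ERE: compared with =~ inside [[ ]]; ^ and $ anchor it to the whole string.
regex='^Backup.*\.tar\.gz$'

matches_glob()   { [[ $1 == $glob ]]; }
matches_regex()  { [[ $1 =~ $regex ]]; }
# Quoting the right-hand side turns the regex into a literal string comparison.
matches_quoted() { [[ $1 =~ "$regex" ]]; }

if matches_glob "$file"; then echo "glob match: $file"; fi
if matches_regex "$file"; then echo "regex match: $file"; fi
if ! matches_quoted "$file"; then echo "quoted regex: no match"; fi
```

Keeping each pattern in its own variable means the conditional itself stays short, and the pattern can be printed or changed on its own while troubleshooting.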
This script loops through the files in the backup files directory, and passes each one to the if conditional, which uses Bash's in-process regular expressions to match the name. If the name matches the regular expression, it echoes a statement.

Now, let's compare the extended glob to the regular expression. The first thing that you may notice is that the regular expression supports anchors. We anchor the match at the beginning of the file name, which must start with either Archive or Backup. In this case, extended globs and extended regular expressions are very similar, with both supporting alternation and grouping. When we compare the year digits, we see that regular expressions are more powerful because of the occurrence operator. In the extended glob, we need to write out four character class ranges. With a regex, we just specify four occurrences of the previous character class range. The same goes for the month. We provide two character classes in the extended glob, and one occurrence operator in the regex. The one- or two-digit day is done in the extended glob by creating a group and using alternation. For the regex, we just provide a range in the occurrence operator. How we handle the various extension combinations is very similar for both, with the regular expression being slightly cleaner and shorter.

In this particular example, we were able to use either an extended glob or a regular expression. However, you may recall that extended globs are for matching files, and regular expressions are for matching text. In our case, we placed the string of characters in a variable, and did the comparison with it. So effectively, both methods are the same. In your terminal, exit the editor by pressing Esc, typing :q!, and pressing Enter.

We've talked about speed before in this course. I have two scripts in the chapter five directory, globfiles.sh and regexfiles.sh. They do the same thing, but globfiles.sh uses an extended glob, and regexfiles.sh uses a regular expression.
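The comparison above can be sketched side by side. The real patterns from regexfiles.sh are not shown in this transcript, so the name layout (Archive or Backup, then a four-digit year, a two-digit month, a one- or two-digit day, and a .tar, .tar.gz, or .tar.bz2 extension, separated by hyphens) and the helper check_name are assumptions made for illustration.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Assumed layout: <Archive|Backup>-YYYY-MM-D[D].tar[.gz|.bz2]
# Extended glob: every digit is written out; the day uses a group with alternation.
glob='@(Archive|Backup)-[0-9][0-9][0-9][0-9]-[0-9][0-9]-@([0-9]|[0-9][0-9]).tar?(.gz|.bz2)'
# ERE: anchors plus occurrence counts {4}, {2}, and {1,2} replace the repetition.
regex='^(Archive|Backup)-[0-9]{4}-[0-9]{2}-[0-9]{1,2}\.tar(\.(gz|bz2))?$'

# Hypothetical helper: report whether each method accepts the given name.
check_name() {
  local name=$1 g=no r=no
  [[ $name == $glob ]] && g=yes
  [[ $name =~ $regex ]] && r=yes
  echo "$name glob=$g regex=$r"
}

for name in Backup-2024-01-5.tar.gz Archive-2023-12-31.tar Backup-24-01-05.tar.gz notes.txt; do
  check_name "$name"
done
```

With these assumed patterns, both methods accept the first two names and reject the last two, which is the sense in which the two approaches are equivalent when the name is held in a variable.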
Let's time globfiles.sh, while redirecting the output to /dev/null. In the terminal, type time ./globfiles.sh > /dev/null and press Enter. Now let's do the same for regexfiles.sh. Type time ./regexfiles.sh > /dev/null and press Enter. You can see that the glob version is much faster. The point of that comparison is to use the best tool for the job. The regular expression in the script is easier to create, more powerful, and easier to read. However, with power comes a speed penalty.
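For readers without the exercise files, a rough way to try the same kind of timing is sketched below. It does not reproduce globfiles.sh or regexfiles.sh; the patterns, the sample name, the iteration count, and the function names are all assumptions, and the measured times will vary with the machine and the Bash version.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Hypothetical patterns and sample name; not taken from the course's scripts.
glob='@(Archive|Backup)-[0-9][0-9][0-9][0-9]-[0-9][0-9]-@([0-9]|[0-9][0-9]).tar?(.gz|.bz2)'
regex='^(Archive|Backup)-[0-9]{4}-[0-9]{2}-[0-9]{1,2}\.tar(\.(gz|bz2))?$'
name="Backup-2024-01-05.tar.gz"
iterations=10000

# Run the extended glob comparison repeatedly and print the number of matches.
count_glob() {
  local i hits=0
  for ((i = 0; i < iterations; i++)); do
    if [[ $name == $glob ]]; then hits=$((hits + 1)); fi
  done
  echo "$hits"
}

# Run the regular expression comparison repeatedly and print the number of matches.
count_regex() {
  local i hits=0
  for ((i = 0; i < iterations; i++)); do
    if [[ $name =~ $regex ]]; then hits=$((hits + 1)); fi
  done
  echo "$hits"
}

# time is a Bash keyword, so it can time a function; its report goes to stderr.
time count_glob > /dev/null
time count_regex > /dev/null
```

Both functions do the same number of comparisons, so the difference between the two real times gives a rough idea of the relative cost of each matching method on your system.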