Learn the most common ways to use find to locate files.
- [Instructor] Welcome to the third video of section two, Finding Files and File Listing. In the previous video we learned how to record and play back terminal sessions. In this video we're going to cover the most common ways to use find to locate files. Find is one of the great utilities in the Unix/Linux command line toolbox. It is a very useful command for shell scripts, but many people do not use it to its fullest effectiveness. The find command uses the following strategy.
Find descends through a hierarchy of files, matches the files that meet specified criteria, and performs some actions. Let's go through different use cases of find and its basic usage. To list all files and folders from the current directory down through its descendant directories, use the syntax find base_path. base_path can be any location from which find should start descending.
For example, /home/slynux. An example of this command is find . -print, where . specifies the current directory and .. specifies the parent directory. This convention is followed throughout the Unix file system. The -print argument tells find to print the names (paths) of the matching files. When -print is used, '\n' (newline) is the delimiting character separating each file.
Also note that even if you omit -print, the find command prints the file names by default. The -print0 argument prints each matching file name delimited by '\0' (the NUL character). This is useful when a file name contains a space character. We have now seen the most common usage of the find command with an example. The find command is a powerful command line tool and it is armed with a variety of interesting options.
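As a minimal sketch of the two print actions described above, the following creates a throwaway directory (the directory and file names are made up for illustration) and lists it first with newline delimiters and then with NUL delimiters. It assumes GNU find, since -print0 is not part of POSIX.

```shell
# Hypothetical sandbox so the example does not touch real files.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/plain.txt" "$demo/sub/with space.txt"

# Newline-delimited listing; -print is also the default action.
find "$demo" -print

# NUL-delimited listing: count the entries by counting NUL bytes.
nul_count=$(find "$demo" -print0 | tr -cd '\0' | wc -c | tr -d ' ')
echo "entries: $nul_count"

rm -rf "$demo"
```

The NUL-delimited form is the one to pipe into tools such as xargs -0, because a file name can contain spaces or even newlines but never a NUL byte.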
Let's take a look at them. Search based on file name or regular expression match. The -name argument specifies a matching string for the file name. We can pass wild cards as its argument. The pattern "*.txt" matches all the file names ending with .txt and prints them. The -print option prints in the terminal the file names or file paths that match the conditions (for example, -name) given as options to the find command. The find command also has an option, -iname (ignore case), which is similar to -name but matches file names while ignoring case.
For example. If we want to match either of multiple criteria we can use OR conditions, as shown in the following.
The previous command will print all of the .txt and .mp4 files, since the find command matches both .txt and .mp4 files. \( and \) are used to treat -name "*.txt" -o -name "*.mp4" as a single unit. The -path argument can be used to match the file path against wild cards. -name always matches against the file name only, whereas -path matches the file path as a whole. For example.
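The name-based tests above can be tried side by side. This is a sketch on a scratch directory with invented file names, assuming GNU find (-iname is an extension that BSD find also offers).

```shell
# Hypothetical sandbox with a few file names to match against.
demo=$(mktemp -d)
mkdir -p "$demo/docs"
touch "$demo/notes.txt" "$demo/REPORT.TXT" "$demo/clip.mp4" "$demo/docs/readme.md"

# Case-sensitive match on the file name only.
by_name=$(find "$demo" -name '*.txt' -print | wc -l | tr -d ' ')

# Case-insensitive variant of the same test.
by_iname=$(find "$demo" -iname '*.txt' -print | wc -l | tr -d ' ')

# OR condition; \( and \) group the two tests into a single unit.
by_or=$(find "$demo" \( -name '*.txt' -o -name '*.mp4' \) -print | wc -l | tr -d ' ')

# -path matches against the whole path, not just the file name.
by_path=$(find "$demo" -path '*/docs/*' -print | wc -l | tr -d ' ')

echo "name=$by_name iname=$by_iname or=$by_or path=$by_path"
rm -rf "$demo"
```

Quoting the patterns matters: without the quotes the shell would expand *.txt itself before find ever sees it.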
The -regex argument is similar to -path, but -regex matches the file paths based on regular expressions. Regular expressions are an advanced form of wild card matching, which enables us to specify text with patterns. By using patterns we can make matches to the text and print them. A typical example of text matching using regular expressions is parsing all email addresses from a given pool of text. An email address takes the form firstname.lastname@example.org, so it can be generalized as [a-z0-9]+@[a-z0-9]+\.[a-z0-9]+.
The plus sign signifies that the preceding class of characters can occur one or more times. The following command matches the .py or .sh files. Similarly, using -iregex ignores case when matching the regular expression. For example.
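Here is a sketch of both ideas: the email pattern applied with grep, and a path regex applied with find. It assumes GNU grep and GNU find with the default (Emacs-style) regex syntax, where grouping and alternation are written \( \) and \|. The sample text and file names are invented.

```shell
# The generalized email pattern from above, applied with grep -E.
email=$(echo 'contact admin@example.com for access' |
  grep -Eo '[a-z0-9]+@[a-z0-9]+\.[a-z0-9]+')
echo "$email"

# Hypothetical sandbox for the find regex tests.
demo=$(mktemp -d)
touch "$demo/build.sh" "$demo/tool.py" "$demo/MAIN.PY" "$demo/notes.txt"

# -regex must match the whole path, hence the leading .*
by_regex=$(find "$demo" -type f -regex '.*\.\(py\|sh\)$' | wc -l | tr -d ' ')

# -iregex is the case-insensitive variant, so MAIN.PY matches too.
by_iregex=$(find "$demo" -type f -iregex '.*\.\(py\|sh\)$' | wc -l | tr -d ' ')

echo "regex=$by_regex iregex=$by_iregex"
rm -rf "$demo"
```

Note that this simple character class has no dot in it, so on an address such as firstname.lastname@example.org the pattern only picks up the part after the first dot.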
Negating arguments. Find can also exclude things that match a pattern using ! (the exclamation mark). For example, ! -name "*.txt" matches all the files whose names do not end in .txt. The following example shows the result of the command. Search based on the directory depth. When the find command is used, it recursively walks through all of the subdirectories until it reaches the leaves of the subdirectory tree.
We can restrict the depth to which the find command traverses using depth parameters given to find. -maxdepth and -mindepth are the parameters. In many cases we need to search only in the current directory, without descending further into its subdirectories. In such cases we can restrict the depth to which the find command should descend using the depth parameters. To restrict find from descending into the subdirectories from the current directory the depth can be set as one.
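Going back to the negation operator described above, a small sketch on a scratch directory (invented file names) shows its effect. The -type f test is an addition here so that the directory itself is not counted among the non-.txt matches.

```shell
# Hypothetical sandbox with one .txt file and two others.
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.log" "$demo/c.sh"

# ! inverts the test that follows it: everything NOT ending in .txt.
find "$demo" -type f ! -name '*.txt' -print

non_txt=$(find "$demo" -type f ! -name '*.txt' -print | wc -l | tr -d ' ')
echo "non-txt files: $non_txt"

rm -rf "$demo"
```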
When we need to descend to two levels the depth is set as two. And so on for the rest of the levels. For specifying the max depth we use the -maxdepth level parameter. Similarly, we can also specify the minimum level at which the descending should start. If we want to start searching from the second level onwards we can set the minimum depth using the -mindepth level parameter. To restrict the find command to descend to a maximum depth of one use the following command.
This command lists all the files whose names begin with f, but only from the current directory. If there are subdirectories, they are not printed or traversed. Similarly, -maxdepth 2 traverses at most two descending levels of subdirectories. -mindepth is similar to -maxdepth, but it sets the least depth level for the find traversal. It can be used to find and print files that are located at a minimum level of depth from the base path. For example, to print all of the files whose names begin with f and are at least two subdirectories distant from the current directory use the following command.
Even if there are matching files in the current directory or in directory one and directory three, they will not be printed. -maxdepth and -mindepth should be specified as the third argument in the find command, right after the base path and before any tests such as -type. GNU find treats them as global options that apply to the whole expression wherever they appear, and it prints a warning if they come after a test. Putting them first also reflects what find does: it limits how deep it descends, and then checks each file it visits against the remaining criteria such as the file type.
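The depth parameters can be sketched on a three-level scratch tree. The directory and file names below are made up for the example, and GNU find is assumed because -maxdepth and -mindepth are not in POSIX.

```shell
# Hypothetical tree: one file starting with f at each of three levels.
demo=$(mktemp -d)
mkdir -p "$demo/dir1/dir2"
touch "$demo/f_top" "$demo/dir1/f_mid" "$demo/dir1/dir2/f_deep"

# Depth 1 only: files directly inside the base path.
shallow=$(find "$demo" -maxdepth 1 -name 'f*' -print | wc -l | tr -d ' ')

# Depth 2 and below: skips anything directly inside the base path.
deep=$(find "$demo" -mindepth 2 -name 'f*' -print | wc -l | tr -d ' ')

echo "maxdepth1=$shallow mindepth2=$deep"
rm -rf "$demo"
```

The base path itself counts as depth 0, which is why -maxdepth 1 still sees the files directly inside it.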
Search based on file type. Unix-like operating systems treat every object as a file. There are different kinds of files, such as regular files, directories, character devices, block devices, symlinks, hard links, sockets, FIFOs, and so on. The file search can be filtered using the -type option. By using -type we can tell the find command to match only files of the specified type. List only directories, including descendants, as follows.
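A sketch of the -type filter on a scratch directory follows. Besides directories (d) it also exercises regular files (f) and symbolic links (l). All names are invented for the example.

```shell
# Hypothetical sandbox with one subdirectory, one file, one symlink.
demo=$(mktemp -d)
mkdir "$demo/subdir"
touch "$demo/regular.txt"
ln -s "$demo/regular.txt" "$demo/link.txt"

# d = directories (the base path itself is included).
dirs=$(find "$demo" -type d -print | wc -l | tr -d ' ')

# f = regular files; the symlink is not followed, so it is not counted.
files=$(find "$demo" -type f -print | wc -l | tr -d ' ')

# l = symbolic links.
links=$(find "$demo" -type l -print | wc -l | tr -d ' ')

echo "dirs=$dirs files=$files links=$links"
rm -rf "$demo"
```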
It is hard to list directories and files separately, but find helps you do it. List only regular files as follows. List only symbolic links as follows. You can use the following type arguments to match the required file type: f for a regular file, l for a symbolic link, d for a directory, c for a character special device, b for a block device, s for a socket, and p for a FIFO. Search on file times. Unix/Linux file systems have three types of timestamps on each file. They're as follows. Access time, -atime, is the timestamp of when the file was last accessed by a user.
Modification time, -mtime, is the timestamp of when the file content was last modified. Change time, -ctime, is the timestamp of when the metadata for a file, such as permissions or ownership, was last modified. -atime, -mtime, and -ctime are the time parameter options available with find. They can be specified with integer values in number of days. These integer values are often prefixed with minus or plus signs.
The minus sign implies less than, whereas the plus sign implies greater than. For example, print all the files that were accessed within the last seven days as follows. Print all the files that have an access time exactly seven days old as follows. Print all the files that have an access time older than seven days as follows. Similarly, we can use the -mtime parameter to search files based on the modification time and -ctime to search based on the change time.
-atime, -mtime, and -ctime are time-based parameters that use days as the unit. There are some other time-based parameters that use minutes as the unit. These are -amin (access time), -mmin (modification time), and -cmin (change time). For example, to print all the files that have an access time older than seven minutes use the following command. Another good feature available with find is the -newer parameter. By using -newer we can specify a reference file to compare timestamps with. We can find all the files that are newer (that is, have a later modification time) than the specified file with the -newer parameter.
For example, find all files that have a modification time later than that of a given file, file.txt, as follows. The timestamp flags of the find command are very useful for writing system backup and maintenance scripts.
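The time-based tests can be sketched as follows. The example uses -mtime and -mmin rather than -atime so that the results do not depend on how the file system records accesses; -atime and -amin take the same +N, N, and -N arguments. It assumes GNU touch (for the -d 'N days ago' form) and GNU find, and the file names are invented.

```shell
# Hypothetical sandbox with files of three different ages.
demo=$(mktemp -d)
touch -d '10 days ago' "$demo/old.txt"
touch -d '2 days ago' "$demo/reference.txt"
touch "$demo/fresh.txt"

# Modified less than 7 days ago.
recent=$(find "$demo" -type f -mtime -7 -print | wc -l | tr -d ' ')

# Modified more than 7 days ago.
stale=$(find "$demo" -type f -mtime +7 -print | wc -l | tr -d ' ')

# Same idea in minutes: modified within the last hour.
last_hour=$(find "$demo" -type f -mmin -60 -print | wc -l | tr -d ' ')

# Modified more recently than the reference file.
newer=$(find "$demo" -type f -newer "$demo/reference.txt" -print | wc -l | tr -d ' ')

echo "recent=$recent stale=$stale last_hour=$last_hour newer=$newer"
rm -rf "$demo"
```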
Search based on file size. A search can be performed based on the size of the files as follows. Instead of k we can use different size units, such as the following: b (512-byte blocks), c (bytes), w (two-byte words), k (kilobytes, 1024 bytes), M (megabytes, 1024 kilobytes), and G (gigabytes, 1024 megabytes).
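A size-based search might look like the sketch below, which builds a 1 KiB file and a 4 KiB file (invented names) and then filters on +2k, -2k, and exactly 4k. GNU find is assumed; it rounds file sizes up to whole units before comparing.

```shell
# Hypothetical sandbox with one 1 KiB file and one 4 KiB file.
demo=$(mktemp -d)
head -c 1024 /dev/zero > "$demo/small.bin"
head -c 4096 /dev/zero > "$demo/big.bin"

# Larger than 2 kilobytes.
bigger=$(find "$demo" -type f -size +2k -exec basename {} \;)

# Smaller than 2 kilobytes.
smaller=$(find "$demo" -type f -size -2k -exec basename {} \;)

# Exactly 4 kilobytes.
exact=$(find "$demo" -type f -size 4k -exec basename {} \;)

echo "bigger=$bigger smaller=$smaller exact=$exact"
rm -rf "$demo"
```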
Deleting based on the file matches. The -delete flag can be used to remove files that are matched by find. Remove all the .swp files from the current directory as follows. Match based on the file permissions and ownership. It is possible to match files based on the file permissions. We can list the files having specified file permissions as follows. -perm specifies that find should only match files with their permissions set to a particular value.
Permissions are explained in more detail in the File permissions, ownership, and sticky bit section of section three, File In, File Out. As an example use case we can consider the Apache web server. The PHP files in the web server require proper permissions to execute. We can find the PHP files that don't have proper execute permissions as follows. We can also search files based on ownership of the files. The files owned by a specific user can be found using the -user USER option.
The user argument can be a username or a UID. For example, to print the list of all files owned by the user slynux you can use the following command.
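The deletion, permission, and ownership tests from the last few paragraphs can be sketched together. The file names and the choice of 644 as the "proper" permission are assumptions for the example, and the current user's numeric UID stands in for a named user such as slynux. -delete is a GNU/BSD extension.

```shell
# Hypothetical sandbox: swap files to delete, PHP files to check.
demo=$(mktemp -d)
touch "$demo/a.swp" "$demo/b.swp" "$demo/ok.php" "$demo/bad.php"
chmod 644 "$demo/ok.php"
chmod 600 "$demo/bad.php"

# Remove every matched .swp file; there is no confirmation prompt.
find "$demo" -type f -name '*.swp' -delete
swp_left=$(find "$demo" -type f -name '*.swp' -print | wc -l | tr -d ' ')

# PHP files whose permissions are NOT exactly 644 (assumed target mode).
wrong_perm=$(find "$demo" -type f -name '*.php' ! -perm 644 -exec basename {} \;)

# Files owned by the current user, given here as a numeric UID.
mine=$(find "$demo" -type f -user "$(id -u)" -print | wc -l | tr -d ' ')

echo "swp_left=$swp_left wrong_perm=$wrong_perm mine=$mine"
rm -rf "$demo"
```

Because -delete acts immediately, it is worth running the same expression with -print first to check what would be removed.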
Executing commands or actions with find. The find command can be coupled with many other commands using the -exec option. It's one of the most powerful features that comes with find. Consider the example in the previous section. We used -perm to find files that do not have the proper permissions. Similarly, consider the case where we need to change the ownership of all files owned by a certain user, for example root, to another user, for example www-data, the default Apache user in the web server.
We can find all the files owned by root by using the -user option and use -exec to perform the ownership change operation. You must run the find command as root if you want to change ownership of files and directories. Let's take a look at the following example. In this command {} (curly braces) is a special string used with the -exec option. For each file match, {} will be replaced with the file name for -exec. For example, if the find command finds two files, test1.txt and test2.txt, with the owner slynux, the find command will perform chown slynux {} for each of them.
This gets resolved to chown slynux test1.txt and chown slynux test2.txt. Sometimes we don't want to run the command for each file. Instead we might want to run it fewer times with a list of files as parameters. For this we use + instead of ; in the -exec syntax. Another usage example is to concatenate all the C program files in a given directory and write them to a single file, let's say all_c_files.txt.
We can use find to match all the C files recursively and use the cat command with the -exec flag as follows. -exec is followed by any command, and {} stands for the match: for every matched file name, {} is replaced with the file name. To redirect the data from find to the all_c_files.txt file we have used the > operator instead of >> (append), because the entire output from the find command is a single data stream (stdout). Appending is necessary only when multiple data streams are to be written to a single file.
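The difference between the two -exec terminators can be sketched on two small C files (invented names). The chown case is shown only as a comment, since it needs root and a real target user.

```shell
# Needs root, so shown for reference only:
#   find . -type f -user root -exec chown slynux {} \;

# Hypothetical sandbox with two one-line C files.
demo=$(mktemp -d)
out=$(mktemp)
printf 'int a;\n' > "$demo/a.c"
printf 'int b;\n' > "$demo/b.c"

# \; runs cat once per match; > captures the single combined stream.
find "$demo" -type f -name '*.c' -exec cat {} \; > "$out"
lines=$(wc -l < "$out" | tr -d ' ')

# With \; echo runs once per file, so there is one output line each.
per_file=$(find "$demo" -type f -name '*.c' -exec echo {} \; | wc -l | tr -d ' ')

# With + the matches are batched into as few invocations as possible.
batched=$(find "$demo" -type f -name '*.c' -exec echo {} + | wc -l | tr -d ' ')

echo "lines=$lines per_file=$per_file batched=$batched"
rm -rf "$demo" "$out"
```

The backslash before the semicolon stops the shell from treating it as a command separator, so that find receives it as the end of the -exec command.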
For example, to copy all of the text files that are older than 10 days to a directory OLD use the following command. Similarly, the find command can be coupled with many other commands. -exec with multiple commands. We cannot use multiple commands along with the -exec parameter. It accepts only a single command, but we can use a trick. Write multiple commands in a shell script, for example commands.sh, and use it with -exec as follows.
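Both ideas can be sketched as below. The contents of commands.sh are hypothetical (the narration does not say what the script does), GNU touch is assumed for creating an old file, and -maxdepth 1 is added so that find does not descend into OLD and pick up the copies.

```shell
# Hypothetical sandbox: one old text file, one new one, and OLD/.
demo=$(mktemp -d)
mkdir "$demo/OLD"
touch -d '15 days ago' "$demo/old.txt"
touch "$demo/new.txt"

# Copy text files older than 10 days into OLD.
find "$demo" -maxdepth 1 -type f -name '*.txt' -mtime +10 -exec cp {} "$demo/OLD" \;
copied=$(ls "$demo/OLD")

# Hypothetical helper script that runs two commands per matched file.
cat > "$demo/commands.sh" <<'EOF'
#!/bin/sh
echo "processing $1"
wc -c < "$1"
EOF

# Run the script once per match; the file name arrives as $1.
processed=$(find "$demo" -maxdepth 1 -type f -name '*.txt' \
  -exec sh "$demo/commands.sh" {} \; | grep -c '^processing')

echo "copied=$copied processed=$processed"
rm -rf "$demo"
```

Invoking the script through sh avoids depending on its execute bit; with chmod +x it could be named directly after -exec instead.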
-exec can be coupled with printf to produce very useful output. Skipping specified directories when using the find command. Skipping certain subdirectories for performance improvement is sometimes required while doing a directory search and performing an action. For example, when programmers look for particular files in a development source tree that is under a version control system such as Git, the source hierarchy will contain a .git directory at the top of the repository. .git stores the version control related information for the repository.
Since version control related directories do not produce useful output they should be excluded from the search. The technique of excluding files and directories from the search is known as pruning. It can be performed as follows. The preceding command prints the name (path) of all the files that are not inside the .git directories. Here \( -name ".git" -prune \) is the exclude portion, which specifies that the .git directory should be excluded, and \( -type f -print \) specifies the action to be performed.
The actions to be performed are placed in the second block. -type f -print, the action specified here, prints the names and paths of all of the files. Awesome, in this video we've successfully learned about finding files and file listings. In the next video we'll learn how to play with xargs.
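A sketch of pruning on a scratch tree that imitates a repository layout follows. The directory and file names are invented and no real Git repository is involved.

```shell
# Hypothetical tree that mimics a repository with a .git directory.
demo=$(mktemp -d)
mkdir -p "$demo/.git/objects" "$demo/src"
touch "$demo/.git/config" "$demo/.git/objects/blob" \
      "$demo/src/main.c" "$demo/README"

# Without pruning, files inside .git are listed too.
all_files=$(find "$demo" -type f -print | wc -l | tr -d ' ')

# Exclude block first, then -o, then the action block.
kept=$(find "$demo" \( -name '.git' -prune \) -o \( -type f -print \) | wc -l | tr -d ' ')

echo "all=$all_files kept=$kept"
rm -rf "$demo"
```

When the first block matches .git, -prune stops find from descending into it and the -o alternative is skipped for that entry; everything else falls through to the second block.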
Note: This course was created by Packt Publishing. We are pleased to host this training in our library.
- Printing in the terminal
- Performing math in the Linux shell
- Getting and setting dates
- Working with functions and arguments
- Reading output
- Making comparisons
- Concatenating text
- Finding, editing, generating, and deleting files
- Running parallel processes
- Using regular expressions
- Downloading webpages
- Parsing data from a website
- Finding broken links
- Backing up and archiving
- Transferring files and data through the network
- Monitoring your Linux system
- Gathering data for system administration