Join Scott Simpson for an in-depth discussion in this video System basics: The Linux file system, part of Linux Tips Weekly.
- [Instructor] On nearly all Linux systems files and folders are organized in a specific way according to the Filesystem Hierarchy Standard, or FHS. This allows both software and users to find what they're looking for in a predictable way. At the base level of the hierarchy, from where the rest of the structure is defined, is the root, represented with a slash. You can think of this like the root system and trunk of a tree. Everything in the filesystem branches off from it somewhere, even other disks and network shares. The root supports the whole structure and there's only one root on a system.
Because of this unified structure individual folders can be relocated to different disks or volumes to allow for more granular management of storage space while keeping the references to them uniform and predictable. In the Filesystem Hierarchy Standard there are a number of folders and subfolders branching off from root, each with a specific purpose. The bin folder contains binaries, or programs, that the system needs to work, even in single user mode. This includes the shell, tools like ls and cat, and other critical components for working with the operating system.
Boot contains the kernel and initial RAM disk, which the system needs in order to start up. The dev folder contains files that refer to devices on the system, including disks, virtual devices, and more. Configuration files for software and tools are kept in the etc folder, some at the base level and others in more specific subfolders under the etc path. Users' home folders where all of their personal files are kept are located under the home folder. The lib folder stores the libraries associated with essential system binaries.
Removable media is mounted into the media folder and other media, like hard drives and network mounts, are usually mounted in the mnt folder. Optional packages, usually software created by or installed by the user and which are not required for the operation of the system are stored in the opt folder. Proc is a folder which presents information about the kernel and processes running on the system as files allowing software and users to directly observe various kernel settings and process parameters. Many tools read from the proc filesystem in order to display information about the system's activity.
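As a rough illustration of reading the proc filesystem directly, the sketch below looks at two entries. The choice of /proc/uptime and /proc/self/status is an assumption made for this example, not something shown in the video, and the checks guard against systems where those files are not available.

```shell
# Read a couple of /proc entries directly (Linux only).
# The files chosen here are illustrative; they are not taken from the video.
uptime_seconds=""
if [ -r /proc/uptime ]; then
    # First field: seconds since boot. Second field: aggregate idle time.
    read -r uptime_seconds idle_seconds < /proc/uptime
    echo "Seconds since boot: $uptime_seconds"
fi

# /proc/self always refers to whichever process is reading it (grep, here)
if [ -r /proc/self/status ]; then
    grep -E '^(Name|Pid|State):' /proc/self/status
fi
```

These are ordinary file reads, which is the same approach that monitoring tools can take when they report on system activity.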
Rather than being kept under the home folder, the root user's personal files are stored in the root folder. Run stores information about running processes, like lock files and process ID files, and other information that's relevant to the system as it runs, like the message of the day and whether a reboot is required. The sbin folder stores system binaries, tools that we use to manage nearly every aspect of the system, from creating filesystems on disks to managing the network connection, and other activities. The srv folder is set aside as a place to store information that various server software provides to clients, be that FTP, NFS, or other software.
Of course, files you share don't have to be located here but the standard provides it in case it's needed. The sys folder gives us files that represent certain aspects of the system, lists of disks, devices, filesystem information, and so on. Temporary information is to be stored in the tmp folder. Because this folder can be cleared out occasionally any information that you want to save needs to go elsewhere. But, while information's being worked on, tmp provides a space for it. This can be helpful because the permissions on the tmp folder are set so that programs can put information there.
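One way to use that temporary space from a script is the mktemp command, sketched here. The variable names are hypothetical, and mktemp itself is not mentioned in the video; it is simply a common way to get a uniquely named scratch file or folder.

```shell
# Show the permissions on the temp folder (typically world-writable)
ls -ld /tmp

# Create a uniquely named scratch file and scratch folder.
# mktemp usually places these under /tmp, or under $TMPDIR if that is set.
scratch_file=$(mktemp)
scratch_dir=$(mktemp -d)

echo "working data" > "$scratch_file"
cat "$scratch_file"

# Clean up when finished rather than waiting for the system to clear /tmp
rm -f "$scratch_file"
rmdir "$scratch_dir"
```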
If you want to put information somewhere else, you would need to ensure that the program had permission to do so. Instead, the system gives us a space where we don't have to worry about that. The usr folder contains tools and utilities for many purposes on the system. Beneath it are various other subfolders for specific purposes, including various kinds of system and user binaries in usr/bin, usr/local, and usr/sbin. Finally, the var folder holds files that can vary but are not considered temporary. There's a series of subfolders there as well for particular purposes, like the system logs.
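To see how a particular system lines up with the folders just described, a short loop like the following can help. It is only a sketch: the list mirrors the folders named in this walkthrough, and not every distribution provides all of them.

```shell
# Report which of the top-level folders described above exist here.
# The list mirrors this walkthrough; some systems omit a few of them.
present=""
missing=""
for dir in bin boot dev etc home lib media mnt opt proc root run sbin srv sys tmp usr var; do
    if [ -d "/$dir" ]; then
        present="$present /$dir"
    else
        missing="$missing /$dir"
    fi
done
echo "Present:$present"
echo "Missing:${missing:- none}"
```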
While the system and various software know where to look for certain files, the shell environment doesn't, and so we need to tell it where to look for the programs that the user might expect to use. At the command line, as we'll see later on, we can type out the full path to a program, but that's a little bit annoying. So, the environment has a variable set inside of it called PATH, which is a listing of places to look when a user types in a command. I can see what the current path is with echo $PATH. The folders where the shell will look for files are listed here, separated by colons.
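The colon-separated list can be hard to read on one line, so here is a small sketch that prints each entry separately. The entry_count variable is a hypothetical name used only for this example.

```shell
# Show PATH as the shell stores it: one line, entries separated by colons
echo "$PATH"

# Print each entry on its own line, in the order the shell searches them
printf '%s\n' "$PATH" | tr ':' '\n'

# Count the entries (tr -d strips padding that some wc versions add)
entry_count=$(printf '%s\n' "$PATH" | tr ':' '\n' | wc -l | tr -d ' ')
echo "PATH has $entry_count entries"
```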
Yours may be different, but typically there are a few common folders in here, like bin and sbin, so the basic commands work without having to use the full path. We can add to this PATH variable, too, either temporarily or permanently. We will do that as we get into modifying the environment later on. It can take a bit of time to get a good sense of where to find files that you're looking for, especially binaries because, as we've seen, they can be in three or four places by design. It's helpful to know a few commands that will show you where files are located. The first one, which works for programs in folders that are listed in the PATH variable, is called which.
So, if I wanted to know where the ls program was, for example, I could type which ls. If you're looking for a file that's outside the PATH there are two options called locate and find. Locate uses a database that's generated periodically to nearly instantly search for a file. The find command searches the disk for files that match a pattern you give. Let's look at both. Locate is fast but the results that it has can be a little bit stale. That's because the process that feeds information into its database runs daily. But, once that information's available, searching with locate is extremely fast so it's especially helpful on large storage volumes like a media drive or a department's file share.
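The which lookup mentioned at the start of this passage can be tried as follows. This is a sketch that assumes which may not be installed everywhere, so it falls back to the shell's built-in command -v, which does a similar PATH search. The ls_path variable is a hypothetical name.

```shell
# Ask where the shell would find ls.
# `command -v` is a POSIX built-in used here as a fallback if which is absent.
if command -v which >/dev/null 2>&1; then
    ls_path=$(which ls)
else
    ls_path=$(command -v ls)
fi
echo "ls resolves to: $ls_path"
```

In an interactive shell where ls is aliased, some versions of these commands may report the alias instead of a file path.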
Let's look for all the .h files on a system. All right, locate *.h. There should be a bunch in the kernel sources. And, there they are. That was fast, though if one of these files were to be deleted, or if we added some new ones, locate wouldn't know about it until tomorrow. The find command would, though, because when we use it to look for a file we give it a path to look in, and find digs through the filesystem starting at that path and checks to see if every file it encounters matches the search term. So, it's slower but it gives more up-to-date results.
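To see that up-to-date behavior on a small scale, the sketch below builds a throwaway folder, searches it, deletes the match, and searches again. The folder and file names are hypothetical. The pattern is quoted so that the shell passes *.h to find unchanged instead of expanding it against the current folder first.

```shell
# Build a small scratch tree to search (hypothetical file names)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/include"
touch "$demo_dir/include/example.h" "$demo_dir/notes.txt"

# Quote the pattern so the shell hands *.h to find unexpanded
found_before=$(find "$demo_dir" -name '*.h')
echo "Before delete: $found_before"

# find walks the tree on every run, so a deletion shows up immediately
rm "$demo_dir/include/example.h"
found_after=$(find "$demo_dir" -name '*.h')
echo "After delete: ${found_after:-nothing found}"

# Remove the scratch tree
rm -r "$demo_dir"
```

The same quoting applies to locate patterns, where installed.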
I'll clear the screen and then let's use find to look for those same .h files. I'll write find / -name *.h, and here they are. We'll get into more advanced usage of find later on. Getting familiar with the standard locations for different types of files can help you navigate an unfamiliar system, make changes to software you haven't encountered before, and design backup strategies. And, the standard helps us keep things organized by providing expected locations for files rather than scattering them all over the place wherever an application developer thought files should go.