- [Instructor] If something goes wrong on your system, one way to approach recovering it is to start up from a live boot environment, like the one that's provided on many installation images. Starting from a live boot environment gives you a stable, reliable platform from which to explore a broken system. A live environment gives you the opportunity to check out hardware, view logs, repair file systems, and even chroot into the existing system and fix broken updates and configurations. A good first step when you're rescuing a system is to check the file system that it's stored on.
Sometimes a live environment will automatically mount file systems it finds, and in order to check the file system you'll need to unmount it. If it's encrypted, you'll need to open the encrypted container before you scan the file system. Then you need to find the name of the device to check. I'll switch over to my system here. I've plugged an installer disk into my regular machine and booted it into the live CD or try-out portion. I'll open a terminal with Ctrl+Alt+T and I'll zoom in a little bit.
I'll type lsblk to list my block devices. Here's my regular system's root file system. I'll make sure it's unmounted with umount and the path to the volume. In my case it's an LVM volume, so it'll be /dev/mapper/ubuntu--vg-root. It's not mounted. Depending on the file system, you'll use different commands to check it, but in general, an ext2, ext3, or ext4 volume will need e2fsck.
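The check described here can be sketched as a small script. This is only an illustration, not the exact commands from the demonstration: the device path is an assumption (use whatever name lsblk reports on your machine), and the `run` helper and `APPLY` variable are hypothetical names introduced so that the script only prints the commands unless you explicitly opt in.

```shell
# Sketch: locate, unmount, and check an ext2/ext3/ext4 volume from a live environment.
# DEVICE is an assumption; replace it with the name that lsblk shows for your volume.
DEVICE="${DEVICE:-/dev/mapper/ubuntu--vg-root}"

# Hypothetical helper: print each command, and execute it only when APPLY=1 is set.
# With APPLY=1 the script needs to run as root.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

check_ext_volume() {
    dev="$1"
    run lsblk -f            # list block devices along with their file system types
    run umount "$dev"       # reports an error if the volume was not mounted; that is fine
    run e2fsck -f "$dev"    # force a full check even if the volume is marked clean
}

check_ext_volume "$DEVICE"
```

Running it as-is prints the three commands for review. Running it with `APPLY=1` as root executes them. Other file system types need their own checking tools instead of e2fsck.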
It looks like mine's okay. At this point we have two options. If we just want to retrieve files or to look at files that are on the system's root file system, we can remount it and work like that. Because a live CD provides a full Linux environment, copying critical files off of a hard drive and exporting logs for analysis are fairly straightforward. Your live environment will probably be able to support joining a Wi-Fi network or an Ethernet network, or at least provide USB access to let you copy files on and off of your system. Remember, though, live boot environments generally don't have any kind of persistence.
So if you do try to copy files off your real system, make sure to put them somewhere other than the live environment's file system. However, if we want to actually modify the system, we need to take a few more steps. Right now, we're using the live boot environment, so the tools that we run there will work relative to that. If we try to install software, for example, it'll install into the live boot environment, albeit temporarily, and not into the real system. And the logs at /var/log will be logs for the live environment, not the system that I'm troubleshooting. Using chroot, we can step inside of the actual file system and work with it as we would ordinarily.
Chroot allows us to change the apparent root of the file system to a particular path. It's often used to isolate processes and users from the larger system. But it's also useful to step inside of a system on an attached disk. In order to step inside my normal system, I need to set up a few mount points that the software will need. I'll store these in a folder inside /mnt called mysystem. I could mount them into another folder, like /mnt itself. This keeps things organized, though, and lets me access anything else that might be in /mnt already.
This folder will be where I direct chroot when I'm done adding mount points. So to begin, I'll mount my system's root file system here into /mnt/mysystem. After that I need to mount the dev, proc, and sys file systems in order for my system to know about the system's hardware and status. I'll write mount -o bind /dev /mnt/mysystem/dev.
And I'll write mount -t proc for the proc file system into /mnt/mysystem/proc. And mount -t sysfs for the sys file system into /mnt/mysystem/sys. I also need to mount any other partitions that my system would expect to have in the same way. So what we've done is create a little tree of mounts within this folder.
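The tree of mounts described here might look like the following sketch. The root device path is an assumption, and the `run` helper and `APPLY` variable are hypothetical names used so that the script only prints the commands by default. The `proc` and `sysfs` source arguments are placeholders that mount accepts for these virtual file systems.

```shell
# Sketch: build the mount tree that chroot will use.
# ROOT_DEV is an assumption; TARGET follows the folder name used in the walkthrough.
ROOT_DEV="${ROOT_DEV:-/dev/mapper/ubuntu--vg-root}"
TARGET="${TARGET:-/mnt/mysystem}"

# Hypothetical helper: print each command, and execute it only when APPLY=1 is set.
# With APPLY=1 the script needs to run as root.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

mount_rescue_tree() {
    root_dev="$1"
    target="$2"
    run mkdir -p "$target"
    run mount "$root_dev" "$target"             # the real system's root file system
    run mount -o bind /dev "$target/dev"        # device nodes from the live environment
    run mount -t proc proc "$target/proc"       # process and kernel information
    run mount -t sysfs sysfs "$target/sys"      # hardware and driver information
    # Any other partitions the system expects (a separate /boot, for example)
    # would be mounted under "$target" in the same way.
}

mount_rescue_tree "$ROOT_DEV" "$TARGET"
```

The order matters: the root file system has to be mounted first, because the dev, proc, and sys mount points live inside it.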
Now we'll use chroot /mnt/mysystem /bin/bash to change the file system root for this session to that little file system tree and open up a bash shell from that file system to work with. I'll clear the screen and I'll run a command that I know I have installed on my real system but that isn't installed in this live CD environment.
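As a rough sketch, the chroot step could be wrapped like this. The unmount commands afterwards are an optional tidy-up that is not shown in the walkthrough, and the `run` helper and `APPLY` variable are hypothetical names used so that the script only prints the commands by default.

```shell
# Sketch: enter the prepared tree with chroot, then unmount it afterwards.
TARGET="${TARGET:-/mnt/mysystem}"

# Hypothetical helper: print each command, and execute it only when APPLY=1 is set.
# With APPLY=1 the script needs to run as root.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

enter_chroot() {
    target="$1"
    run chroot "$target" /bin/bash   # interactive shell inside the real system; exit to leave
    # Optional tidy-up (an addition, not from the walkthrough):
    # unmount in the reverse order of mounting.
    run umount "$target/sys"
    run umount "$target/proc"
    run umount "$target/dev"
    run umount "$target"
}

enter_chroot "$TARGET"
```

While the shell started by chroot is open, paths such as /var/log refer to the real system rather than the live environment.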
That looks familiar. This lets us install, upgrade, or remove packages, modify system files, fix the boot loader and more. A useful step in diagnosing a broken system is to look at the logs and see if anything jumps out. You'll want to search for things like error or failure and that might lead you to a hardware problem, like errors from a hard disk as it fails. Logs can also alert you to failed updates or other conditions that can be resolved. Generally speaking, the log files should point you in the right direction. If something serious happened it should be visible in the logs directly or indirectly.
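A simple way to search a log for those keywords is sketched below. The log path is an assumption, since log file names vary between distributions, and `find_problems` is a hypothetical helper name.

```shell
# Sketch: scan a log file for lines that mention errors or failures.
find_problems() {
    # -i ignores case, -n prefixes each match with its line number,
    # -E enables the "error|fail" alternation; "fail" also matches "failed" and "failure".
    grep -inE 'error|fail' "$1"
}

# LOG_FILE is an assumption; inside the chroot it might be /var/log/syslog.
LOG_FILE="${LOG_FILE:-/var/log/syslog}"
if [ -r "$LOG_FILE" ]; then
    # Show only the 20 most recent matches to keep the output readable.
    find_problems "$LOG_FILE" | tail -n 20 || true
fi
```

The pattern is deliberately broad, so expect some harmless matches. It is a starting point for reading the log, not a diagnosis.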
If a system stops booting or starts crashing without any indication in the logs, though, that can be a little bit more difficult to pin down. If there's nothing that stands out in the logs, it's time to start poking around a little bit more. A good place to start is to look at the hard disk and make sure that the file systems aren't out of space. Sometimes on Ubuntu a boot partition can fill up and cause problems when trying to update or start the system. This can usually be resolved with apt autoremove. If your other file systems are out of space, consider moving some bulky files off that file system onto other storage.
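One way to spot a full file system is to filter the output of df, as in this sketch. The 90 percent threshold and the `report_full` helper name are assumptions chosen for illustration.

```shell
# Sketch: list mount points whose usage is at or above a threshold percentage.
# Reads `df -P` output on standard input; the first argument is the threshold (default 90).
report_full() {
    # In df -P output, column 5 is the capacity (for example 95%) and column 6 the mount point.
    # Limitation: device names or mount points containing spaces will confuse the columns.
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

df -P | report_full 90
```

If /boot shows up in the list on an Ubuntu system, running apt autoremove inside the chroot is the step mentioned above for clearing it out.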
When you're done working with a chroot environment, type exit to move back out of it to the live environment, and then you can restart to see if your fixes worked. Sometimes if a system is in an unrepairable state all that's left is to back up your data and reinstall. That's never fun because you don't really get the satisfaction of figuring out and fixing a problem. But what's more important is that a broken system gets back up and running. You can make a backup fairly easily by copying files from your previous hard disk or your home folder or whatever contains the information you want to save with rsync or archiving them with tar.
Or you can make a full image of the partition with dd or other imaging tools. If you do reinstall, you can restore this backup, and make sure to repair the permissions on it if your user ID changed. Troubleshooting is an approach to a problem more than it is a process. So having some tools like logs, disk repair, chroot, and a live environment allows you to work flexibly to solve whatever problem you face.
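The backup options mentioned above can be sketched as follows. All paths here are hypothetical examples, and `backup_dir` is a helper name introduced for illustration. The rsync and dd lines are left as comments because they depend entirely on your own devices and destinations.

```shell
# Sketch: archive a directory into a gzip-compressed tarball.
# tar records ownership and permissions in the archive, which helps when restoring.
backup_dir() {
    src="$1"     # directory to back up, e.g. /mnt/mysystem/home (hypothetical)
    dest="$2"    # archive path on external storage, e.g. /media/usb/home.tar.gz (hypothetical)
    # -C switches to the parent directory so the archive holds relative paths.
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
}

# Example call (hypothetical paths):
#   backup_dir /mnt/mysystem/home /media/usb/home.tar.gz
#
# Alternatives mentioned in the text, again with hypothetical paths:
#   rsync -a /mnt/mysystem/home/ /media/usb/home-backup/    # copy files, keeping attributes
#   dd if=/dev/sdX1 of=/media/usb/partition.img bs=4M       # raw image of a whole partition
```

Whichever method you use, write the result to external storage rather than to the live environment, since the live environment generally does not persist across reboots.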