Tuesday, February 9, 2010

Checking your disk partitions without rebooting Linux

Sometimes, after a power failure, your Linux distro starts up again without checking the disk. This is especially a problem if you are using the older non-journaling EXT2 file system.
There are other situations in which you would want to manually check the integrity of your partitions, without the hassle of rebooting the system if it turns out that nothing is wrong. So, let's see how we can achieve that.

The first thing that we need to do is temporarily shut down the graphical system, as desktop environments can keep open handles for quite a large number of files.
Before that, we will switch to a text terminal, by pressing CTRL+ALT+F2 - that is the F2 key, not the F and 2 keys. You could press any of the F1-F7 keys instead, depending on the configuration of your system, so if F2 doesn't work try one of these.
This will drop you to a login prompt, where you should log in as the root user. If you are on one of the systems that make use of the sudo command (like Ubuntu), just log in with your normal username and password, then type sudo su to have the same permissions as root.
Now that we have root privileges, let's kill the graphical environment by stopping its display manager, be it the one from GNOME (gdm), the one from KDE (kdm) or the plain X one (xdm). Since we don't know exactly which manager you are using, we are going to try them all:
/etc/init.d/gdm stop
/etc/init.d/kdm stop
/etc/init.d/xdm stop
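
If you would rather not fire all three commands blindly, here is a minimal sketch that looks for the init script first. The function name find_display_manager is my own invention, and it assumes your distro keeps its scripts in /etc/init.d like in the commands above:

```shell
# Print the name of the first display manager init script that exists.
# $1 is the directory to search; it defaults to /etc/init.d (an assumption
# taken from the commands above, your distro may differ).
find_display_manager() {
    dir="${1:-/etc/init.d}"
    for dm in gdm kdm xdm; do
        if [ -x "$dir/$dm" ]; then
            echo "$dm"
            return 0
        fi
    done
    # None of the three scripts was found.
    return 1
}
```

You could then stop whatever it finds with dm=$(find_display_manager) && /etc/init.d/"$dm" stop, and keep $dm around for starting it again later.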

One of those will shut down your graphical session - just remember the one that worked. Now we can get to identifying the partition on which your system is installed. Just type mount, and this command will show the currently mounted disk partitions. The output on my computer looks like this:
/dev/sda3 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw)
none on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)

We are looking for the specific device names for the root ("/") partition, and any /home or /boot partitions that may exist. Look for lines of this form: /dev/sdXN on /SOME_FOLDER_NAME type FS_TYPE. X will be a letter, usually a or b, and N will be a number. There might not be any lines of that form except the "on /" one that you can see in the output above, and that's just fine, because we will only have to check one partition.
Jot down the device paths (/dev/sdXN), mount points and types (ext2, ext3, ...).
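
If the mount output is long, a small filter can do the jotting down for you. This is only a sketch: list_disk_partitions is a name I made up, and it assumes the "DEVICE on MOUNTPOINT type FSTYPE" layout shown above and mount points without spaces in them:

```shell
# Read `mount` output on standard input and print one line per real disk
# partition, in the form: device mountpoint fstype
# Assumes the "DEVICE on MOUNTPOINT type FSTYPE (options)" layout and
# mount points that contain no spaces.
list_disk_partitions() {
    awk '$1 ~ /^\/dev\// && $2 == "on" && $4 == "type" { print $1, $3, $5 }'
}
```

Call it as mount | list_disk_partitions. Virtual filesystems such as proc and sysfs are skipped because their first field does not start with /dev/.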
We aren't going to check them right now, because there is a safety precaution we need to take first. Running a filesystem check (and doing repairs) on a partition while other applications are writing to it can cause major disk corruption. We are going to prevent that by remounting the partitions in read-only mode. This command will take care of that:
mount /dev/sdXN [target-mount-point] -o remount,ro -t TYPE

Of course, you will need to swap the XN part with the letter and number that fit your partition, and the [target-mount-point] with the corresponding path, be it "/", "/boot" or "/home". Let's dissect that command: first, we are calling the mount utility, which binds a partition to a folder and enables access to the files within. Then, we are supplying the device path for the partition we want to (re)mount, the target path where it is mounted, and the options. The options are vital, because they instruct mount to, well, remount the partition, and to make sure that the partition will be read-only, so that no unwanted disk writes will happen. Finally, we specify the filesystem type. Do not add the -f option here: in mount it means "fake", not "force", and it makes mount go through the motions without actually remounting anything. If mount complains that the partition is busy, some program still has a file open for writing; stop that program and try again. For my computer, the command will be mount /dev/sda3 / -o remount,ro -t ext4.
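
Before moving on, it may be worth confirming that the partition really is read-only now. Here is a rough sketch that reads /proc/mounts, the kernel's own list of mounts. The name is_readonly and the optional second argument (a different file to read, handy for trying it out) are my additions:

```shell
# Succeed (exit status 0) if the given mount point is mounted read-only.
# $1: mount point, e.g. / or /boot
# $2: mount table to read; defaults to /proc/mounts
is_readonly() {
    mp="$1"
    table="${2:-/proc/mounts}"
    awk -v mp="$mp" '
        # Remember the options of the LAST entry for this mount point,
        # since "/" can be listed more than once (rootfs, then the disk).
        $2 == mp { opts = $4 }
        END {
            n = split(opts, o, ",")
            # Exact match only, so "errors=remount-ro" does not count.
            for (i = 1; i <= n; i++) if (o[i] == "ro") found = 1
            exit (found ? 0 : 1)
        }
    ' "$table"
}
```

For example: is_readonly / && echo "safe to check". If it fails, the remount did not take effect and you should not run the check yet.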
After we've put all the safety precautions in place, it's time to check the partition. Back when we issued the mount command to list all the partitions, I asked you to note the type, as it will be needed in this step too. The tool that checks the filesystem is called fsck, but it needs to know what kind of disk structures to check. That's why you need to call it like this:
fsck.[filesystem-type] -fp /dev/sdXN

You will understand the concept much better if I write down the corresponding command for my computer: fsck.ext4 -fp /dev/sda3. The f parameter tells fsck to check the partition even if it is marked clean, and the p one tells it to automatically fix every error that can be repaired safely without asking; if it runs into something more serious, it stops and tells you to run fsck manually. You may be warned that checking a mounted filesystem can cause severe problems, but the read-only remount we did before greatly reduces that risk, since nothing else can write to the partition during the check.
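
When fsck finishes, its exit status tells you what happened, and it is a sum of the codes listed in the fsck man page. Here is a small sketch that spells them out; describe_fsck_status is just a helper name I picked:

```shell
# Translate an fsck exit status into plain words, one line per set bit.
# The meanings are the ones documented in the fsck man page.
describe_fsck_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then echo "errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "reboot recommended"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $((status & 32)) -ne 0 ]; then echo "canceled by user"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "shared library error"; fi
    return 0
}
```

Right after the check, run describe_fsck_status $? to see the verdict. A status that includes 2 is your cue to reboot, and one that includes 4 means the p option was not enough and the partition needs a manual fsck run.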
Now, I know that I've put without rebooting in the title of this post, but if fsck fixed errors on your drive, you should issue a reboot command, because the running system may still hold the old, pre-repair data in memory. Otherwise, we need to mount the partition back in read-write mode. No biggie, we run the same mount command as for read-only mode, with rw in place of ro (and, again, without the -f option, which would only fake the operation):
mount /dev/sdXN [target-mount-point] -o remount,rw -t TYPE

That rw instructs the mount utility to enable writing on the partition.
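
To recap the sequence for one partition, here is a sketch that only prints the three commands in order, so you can read them over before typing them. The function name print_check_plan is made up for this post, and the mount lines leave out -f, which the mount man page describes as faking the operation:

```shell
# Print (but do not run) the commands for checking one partition.
# $1: device, $2: mount point, $3: filesystem type
print_check_plan() {
    dev="$1"
    mp="$2"
    fstype="$3"
    echo "mount $dev $mp -o remount,ro -t $fstype"   # 1. make it read-only
    echo "fsck.$fstype -fp $dev"                     # 2. check and repair
    echo "mount $dev $mp -o remount,rw -t $fstype"   # 3. back to read-write
}
```

For my computer, print_check_plan /dev/sda3 / ext4 prints the exact commands used in this post. Remember to skip the last one and reboot instead if fsck reported that it fixed something.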
If you did these steps for all your computer's partitions (or just the root partition - "/" - if you have a simple partitioning scheme) and everything is working as it should, then it's time to restart the display manager. Back when we killed it, we tried a few commands to pinpoint the installed display manager. Now we need to recall the command that worked, and swap the stop parameter with a start one. Assuming it was GDM, the command will be:
/etc/init.d/gdm start

Your graphical session should start on terminal 8 (or terminal 7 on some systems), so if it doesn't switch automatically, just press CTRL+ALT+F8, or CTRL+ALT+F7 if that shows nothing. That's all, enjoy your cleaned-up partitions.
