Background fsck problems

We had some trouble with our [FreeBSD][] systems over the holiday shutdown a few weeks ago. Our backup generator didn’t kick in during an extended power outage, and our UPSes didn’t provide enough runtime to see us through. Result: all our systems crashed. (Before anybody mentions it… Yes, I *know* I should have installed [NUT][] and configured our network to shut down in such cases.)

When I got to the office and started powering systems back up, I noticed the expected messages about the filesystems not being dismounted properly. After all the systems were up, I went back around and started logging in to survey the damage. Here’s an example of what happened when I tried to run [fsck][] on one of our filesystems:

# fsck -y /tmp
** /dev/twed0s1e (NO WRITE)
** Last Mounted on /tmp
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity

I had never seen “(NO WRITE)” show up on fsck before. A little searching turned up [this post][] which explains that *(NO WRITE)* means the filesystem is mounted, thus fsck cannot write to it. I went back and rebooted each system into single-user mode. Then I was able to fsck all the local filesystems and reboot cleanly.

# init 1
# fsck -y -t ufs
# reboot
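
Since *(NO WRITE)* shows up when fsck is pointed at a mounted filesystem, it can help to check for that before running it. The following is only a rough sketch: `is_mounted` is a hypothetical helper (not a system command) that assumes the usual `<device> on <mountpoint> ...` layout of `mount` output and does not handle mount points containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount`-style output on stdin and succeeds
# if the given mount point appears in a "<device> on <mountpoint> ..." line.
is_mounted() {
    awk -v mp="$1" '$2 == "on" && $3 == mp { found = 1 } END { exit !found }'
}

# Example use: warn before running fsck on a filesystem that is still mounted.
if mount 2>/dev/null | is_mounted /tmp; then
    echo "/tmp is mounted; fsck cannot write to it"
fi
```

If the check reports the filesystem as mounted, dropping to single-user mode first (as above) is the approach that worked here.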

[NUT]: “Network UPS Tools”
[this post]: “Message explaining (NO WRITE)”

About Jim Vanderveen

I'm a bit of a Renaissance man, with far too many hobbies for my free time! But more important than any hobby is my family. My proudest accomplishment has been raising some great kids! And somehow convincing my wife to put up with me since 1988. ;)

3 Responses to Background fsck problems

  1. Joseph Scott says:

    The generator didn’t kick in? Impossible, it always works 🙂

  2. Jim says:

    This afternoon I found out what happened to the generator. We had a scheduled power outage on Dec 26. When the electrician came out here he (helpfully) started up the generator before he cut the main power. But because the power was “on” (due to the generator), the cut-over circuit (from utility power to the generator) didn’t trip. The generator was effectively idling; meanwhile, none of the emergency circuits had any power. It took about 20 minutes to fix the problem, but none of our UPSes have enough battery capacity to run that long.

    On the “plus” side, I finally have approval to implement NUT here! 🙂

  3. Derek Kulinski says:

    Well, since a long time has passed since your blog post, this probably isn’t relevant anymore.
    If fsck was fixing inconsistencies during the scan (not just removing unused blocks), then you might want to add background_fsck="NO" to your /etc/rc.conf.

    The goal of SoftUpdates is to order writes so that any power failure results mostly in unnecessarily allocated blocks, and no data is ever lost. The whole point of background fsck is to reclaim those blocks.

    The problem is with drives that increase performance by reordering writes (most often through a write cache). When that happens, all the hard work done by SoftUpdates is lost, and it is possible for the drive to be left in an inconsistent state.

    If that happens and you have background fsck enabled, you might miss any indications that there’s something wrong with the filesystem, and by using an inconsistent FS you might increase the chances of data loss…
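
For reference, the setting mentioned in the comment above would look roughly like this in `/etc/rc.conf`. This is a minimal fragment, assuming you want fsck to run in the foreground at boot instead of in the background; the comment lines are illustrative only.

```shell
# /etc/rc.conf fragment (sketch): disable background fsck so that
# filesystem checks run in the foreground during boot.
background_fsck="NO"
```

With this set, any problems fsck finds are reported during boot instead of being handled quietly in the background after the system is already up.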
