What is fsck up to now?

saagarjha · on March 17, 2020

One of my favorite BSD features is SIGINFO, which is intended for applications to give some sort of information about what they’re currently doing. If you’re on macOS, I know some of the copying commands (such as dd) implement it, and a ^T in your terminal will tell you how far along it is.

masklinn · on March 17, 2020

SIGINFO is awesome, I've no idea why the linux world refuses to copy its semantics.

kbenson · on March 17, 2020

While named somewhat poorly for what it is sometimes used for, SIGQUIT can be used for this. Ping uses it for statistics reporting in the middle of pinging. You can use Ctrl+\ in the terminal to send it. E.g.

    $ ping google.com
    PING google.com (172.217.5.110) 56(84) bytes of data.
    64 bytes from sfo03s07-in-f110.1e100.net (172.217.5.110): icmp_seq=1 ttl=54 time=4.04 ms
    64 bytes from sfo03s07-in-f110.1e100.net (172.217.5.110): icmp_seq=2 ttl=54 time=4.04 ms
    2/2 packets, 0% loss, min/avg/ewma/max = 4.037/4.039/4.037/4.042 ms
    64 bytes from sfo03s07-in-f110.1e100.net (172.217.5.110): icmp_seq=3 ttl=54 time=4.16 ms
    64 bytes from sfo03s07-in-f110.1e100.net (172.217.5.110): icmp_seq=4 ttl=54 time=4.06 ms
    4/4 packets, 0% loss, min/avg/ewma/max = 4.037/4.076/4.054/4.164 ms
    64 bytes from sfo03s07-in-f110.1e100.net (172.217.5.110): icmp_seq=5 ttl=54 time=4.19 ms
    64 bytes from sfo03s07-in-f110.1e100.net (172.217.5.110): icmp_seq=6 ttl=54 time=4.20 ms
    ^C
    --- google.com ping statistics ---
    6 packets transmitted, 6 received, 0% packet loss, time 11ms
    rtt min/avg/max/mdev = 4.037/4.114/4.195/0.068 ms

_ytji · on March 17, 2020

I was thinking it was SIGINFO, but this reminded me of the SIGUSR1 trick to get the current status of a running dd process:

  $ dd if=/dev/zero of=/dev/null& pid=$!
  $ kill -USR1 $pid; sleep 1; kill $pid
  
  18335302+0 records in
  18335302+0 records out
  9387674624 bytes (9.4 GB) copied, 34.6279 seconds, 271 MB/s

dntbnmpls · on March 18, 2020

That's interesting. I always wondered how people monitored the progress of dd before it added the status option.

Now you can use the status option to get a realtime update of the progress.

    dd if=/dev/urandom of=/dev/null status=progress

masklinn · on March 18, 2020

It is siginfo on BSDs, sigusr1 is the fallback.
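For a script that has to work on both, here is a minimal sketch of picking the signal at run time. The helper name `status_signal` is made up for illustration, and it assumes `kill -l` lists signal names the way common shells do:

```shell
# Pick the signal that asks dd for a status report on this system:
# INFO where the shell's kill knows about it (BSDs, macOS), USR1 otherwise.
# status_signal is a hypothetical helper name, not part of any tool.
status_signal() {
    if kill -l | grep -Eqw 'SIGINFO|INFO'; then
        echo INFO
    else
        echo USR1
    fi
}
```

With a dd running as `$pid`, `kill -"$(status_signal)" "$pid"` would then request the report. Since USR1 terminates a process that has not installed a handler yet, a short delay after starting dd is prudent.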

JdeBP · on March 17, 2020

There have been attempts to do so over the years. https://lkml.org/lkml/2019/6/5/174 was one. I haven't looked to see what became of them.

saagarjha · on March 17, 2020

Signal numbers are a non-renewable resource, perhaps they think it'll cut into their valuable reserves ;)

war1025 · on March 17, 2020

I have used SIGUSR1 and SIGUSR2 for this purpose at work.

masklinn · on March 17, 2020

The problem with SIGUSR is that it has no defined semantics and defaults to terminating the application, so you can't just throw SIGUSRs at random processes; you need to know somehow that the process does something useful. Furthermore, I don't think the SIGUSRs have a control code.

SIGINFO is ignored by default, and pretty clearly an info-dump trigger, so you can throw ^T at any random utility you're running, worst case scenario is you'll just get a time-type dump.

steerablesafe · on March 17, 2020

Huh, AFAIK on linux dd responds to SIGUSR1.

naniwaduni · on March 17, 2020

The trouble with SIGUSR1 is that the default action is terminate, so you can only send it to processes you know will take it well.

It's safe to send SIGINFO to processes that don't know about it; the default action is to ignore it. You can send it to an entire process group, and maybe some of them will answer it. But even if they don't, they won't just get killed for it.

This makes it so much more useful and discoverable, since you can almost always ^T with little risk. Usually you'll get at least a bit of information about how long the current command (more or less) has been running. If the running program happens to handle it, so much the better.
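A minimal sketch of the handler side in a shell script, assuming a made-up worker loop; the names `report_progress`, `work`, `done_count` and `total` are illustrative only. It uses USR1 because Linux has no SIGINFO; on a BSD the same trap could name INFO instead.

```shell
#!/bin/sh
# Hypothetical worker that reports progress when it receives SIGUSR1.
# Without the trap, the default action for SIGUSR1 would terminate the script.

done_count=0    # illustrative counter, updated by the loop below
total=5

report_progress() {
    printf 'progress: %d/%d items\n' "$done_count" "$total"
}

# Install the handler before any signal can arrive.
trap report_progress USR1

work() {
    while [ "$done_count" -lt "$total" ]; do
        done_count=$((done_count + 1))
        # Stand-in for real work. For demonstration only, the script signals
        # itself once; normally the signal would come from another terminal.
        if [ "$done_count" -eq 3 ]; then
            kill -USR1 "$$"
        fi
    done
}
```

Calling `work` runs the loop; in a real script you would drop the self-signal and run `kill -USR1 <pid>` from elsewhere, knowing in advance that the script handles it.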

JNRowe · on March 17, 2020

On Linux you can use progress¹ for many of these use cases. By default it scans for running processes that you might want to know about, but you can also ask it to tell you about a PID with -p. It supports a -m[onitor] mode to report status until the command exits, and features some basic filtering options to ignore certain files.

You can also manually dig about in /proc/$pid/fd{,info}/ if you want something more fancy, like using gdbar² to display a graphical progress through files for a given process.

1. https://github.com/Xfennec/progress

2. https://github.com/robm/dzen
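For the manual /proc route, a rough Linux-only sketch, assuming the process is reading a regular file sequentially. The helper name `fd_progress` is made up, and `stat -c` assumes GNU or BusyBox stat:

```shell
# Linux-only sketch: report how far a process has read into an open file.
# fd_progress is a hypothetical helper, not part of progress or any other tool.
fd_progress() {
    pid=$1
    fd=$2
    # /proc/<pid>/fdinfo/<fd> has a "pos:" line with the current file offset.
    pos=$(awk '/^pos:/ { print $2 }' "/proc/$pid/fdinfo/$fd") || return 1
    # /proc/<pid>/fd/<fd> is a symlink to the open file; -L follows it.
    size=$(stat -L -c %s "/proc/$pid/fd/$fd") || return 1
    [ -n "$pos" ] && [ "$size" -gt 0 ] || return 1
    echo "$((pos * 100 / size))%"
}
```

Something like `fd_progress "$pid" 3` would print a percentage for file descriptor 3 of that process; which descriptor to look at is something you would find by listing /proc/$pid/fd first.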

tetris11 · on March 17, 2020

It's good, but it suffers from the same slowdowns as htop when there are multiple operations happening, as it crawls through /proc.

JNRowe · on March 17, 2020

Yeah, I'd recommend using the `-p $pid` option when you can. Not just because it doesn't need to scan all of /proc, like the -c[ommand] or default mode do, but also because it doesn't suddenly start listing other processes when you're in monitor mode.

That said, sometimes it is nice to see the other commands pop up in monitor mode. For example, when the rate suddenly drops in a command that you care about, the other output will often show the culprits for you to `kill -STOP`.

saagarjha · on March 17, 2020

https://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;...

voxadam · on March 17, 2020

https://stuff-things.net/2016/04/06/that-one-stupid-dd-trick...

bobbylarrybobby · on March 18, 2020

This is absolutely fantastic; thank you for teaching me about this. Seems like the default output is

    load: {load%}  cmd: {cmd name} {PID} running {user time}u {system time}s

Being able to grab the PID from a currently running process -- in the same shell it's running in! -- is priceless on its own. The rest is icing on the cake.

tonyarkles · on March 18, 2020

Wow! I have been a Mac user since PPC was a thing, and I had no idea! Thank you!

jolmg · on March 17, 2020

Does cp support it (on a BSD)?

saagarjha · on March 17, 2020

macOS's does.

monitron · on March 17, 2020

This is cool. But it seems to me that this person's setup could benefit from a way for the computers to be notified that battery power would run out soon so that they could shut down cleanly. I have that going at home through a USB connection to a UPS...it would be harder to set it up with a central battery but they seem to be up to fun challenges :)

amenonsen · on March 17, 2020

(I'm this person.) It so happens that the solar inverter I'm using right now doesn't provide a data connection that I could use to shut down the computers cleanly.

But I should also clarify that my long-running fsck isn't always the result of an unclean shutdown. There's something about my combination of iSCSI+crypttab+NFS that causes fsck to be run too often—even if I shut down the machine cleanly while the NAS is running, it usually decides to fsck when it comes back up.

Something to investigate next winter, perhaps.

monitron · on March 17, 2020

Hi :) Thanks for the awesome write-up. I understand...I have a few devices in my house that resist all attempts at integration. I have considered doing something totally ridiculous like setting up a Raspberry Pi with a camera and machine vision software to watch the LED displays on these devices to glean status information. Silly, but...

Interesting about fsck running for unclear reasons. "What is it up to now?" is a valid question at multiple levels!

amenonsen · on March 17, 2020

> Thanks for the awesome write-up.

Glad you enjoyed it. :-)

> I have considered doing something totally ridiculous like setting up a Raspberry Pi with a camera and machine vision software to watch the LED displays on these devices to glean status information. Silly, but...

Ha! I actually have a PoE camera pointed at the display of my UPS. Here's what it looks like right now: https://toroid.org/misc/ups-display.jpg

Notice that horizontal blank row of dead pixels halfway down the right side of the display? The one that makes "54.6" look like "51.6"? That gap defeated my naïve five-minute attempt to use image recognition to extract the battery voltage.

all2 · on March 17, 2020

I would be tempted to tie the LED lines to digital ins on an Arduino. I'm betting some finagling with the display data lines could get you the information from the display, as well.

Depending on how the cabling to that display works, you might be able to do all of that without having to disable the display.

dfc · on March 17, 2020

What's up with the "[f]" in your grep command?

    grep '[f]sck'

geraldcombs · on March 17, 2020

It keeps the grep command itself from showing up in the output.
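The reason is that ps shows grep's own command line exactly as typed, and the text "[f]sck" does not contain the string "fsck" that the pattern matches. A small simulation shows the difference; the `fake_ps` listing and its PIDs are made up:

```shell
# Simulated `ps` listing: one fsck process plus the grep doing the search,
# whose own command line contains the pattern exactly as typed.
fake_ps() {
    printf 'root   812 /sbin/fsck.ext4 -a -C0 /dev/sdb1\n'
    printf 'user  1301 grep %s\n' "$1"
}

# Count how many lines of the listing the given pattern matches.
match_count() {
    fake_ps "$1" | grep -c -- "$1"
}

match_count 'fsck'      # prints 2: the fsck line and the "grep fsck" line
match_count '[f]sck'    # prints 1: "grep [f]sck" does not contain "fsck"
```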

linsomniac · on March 17, 2020

Ooh, that's very cute! I usually throw a "| grep -v grep" on the end, but I'm gonna try to remember this.

JdeBP · on March 17, 2020

Better to remember pgrep, instead of an incremental change that you'll find still has problem cases (such as matching usernames).

* http://mywiki.wooledge.org/ProcessManagement#But_I.27m_on_so...

M. Wooledge's description of ps options is not quite accurate, but that is incidental to xyr main point.

mrguyorama · on March 17, 2020

You could run a cheap/small UPS off your main stores maybe? That could possibly provide the signal and management.

JdeBP · on March 17, 2020

Enjoy some more manual pages in the same vein:

* http://jdebp.uk./Softwares/nosh/guide/commands/monitored-fsc...

* http://jdebp.uk./Softwares/nosh/guide/commands/monitor-fsck-...

And a service:

    % system-control print-service-scripts monitor-fsck-progress
    start:#!/bin/nosh
    start:true
    stop:#!/bin/nosh
    stop:true
    run:#!/bin/nosh
    run:#local socket used for monitor-fsck-progress
    run:local-stream-socket-listen --systemd-compatibility --backlog 2 --mode 0644 /run/fsck.progress
    run:setsid
    run:setlogin -- daemon
    run:vc-get-tty console
    run:fdmove -c 4 2
    run:open-controlling-tty
    run:fdmove 2 4
    run:setuidgid -- daemon
    run:./service
    service:#!/bin/nosh
    service:#fsck combined progress information displayed on /dev/console
    service:monitor-fsck-progress
    restart:#!/bin/sh
    restart:exec false      # ignore script arguments
    %

phaemon · on March 17, 2020

Don't usually care about titles but surely, "What, the fsck, have you done?"

dhosek · on March 17, 2020

I've always figured that fsck is called that because they couldn't come up with an acronym with u.

saagarjha · on March 17, 2020

Filesystem Uniformity ChecK isn't a huge reach…

amenonsen · on March 18, 2020

https://groups.google.com/forum/#!msg/alt.sysadmin.recovery/...

knorker · on March 17, 2020

Ugh, the more I see of systemd the worse it looks. What a mess of a "design" we see a glimpse of here.

hello_tyler · on March 19, 2020

Did he ever get to fsck? Or did it just hang the entire time?

ck2 · on March 17, 2020

     killall -USR1 e2fsck

or start it with -C

JdeBP · on March 17, 2020

systemd-fsck, as the article outright told you, already starts it with -C .