Never use potentially dangerous commands (like rm -f) just to avoid the prospect of an error!
Instead, one should practice good error handling. For some scripts, and depending on the audience, "set -e" may be sufficient. But usually it's better to handle each failure explicitly.
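For example, a minimal sketch of what I mean:

    rm file.txt || { echo "failed to remove file.txt" >&2; exit 1; }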
Or, for longer shell scripts, I always include a little function that deals with error handling. You can call it in the same way:
rm file.txt || errorHandler "ignore"
As shown above, the function can be built to take various parameters, and can then for instance abort or ignore the error. It can also take care of updating a log file.
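A minimal sketch of such a function (the log file name and the exact behavior are just an example):

    errorHandler() {
        local status=$?              # exit status of the command that just failed
        local action="${1:-abort}"
        echo "$(date '+%F %T') command failed with status $status" >> error.log
        if [ "$action" != "ignore" ]; then
            exit "$status"
        fi
    }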
Somewhat related, I do logging in shell scripts with a function, too:
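Roughly like this (the format and log file name are just an example):

    log() {
        printf '%s [%s] %s\n' "$(date '+%F %T')" "${2:-INFO}" "$1" >> script.log
    }

    log "starting backup"
    log "disk almost full" WARN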
That way, I only need to write the code that writes log lines with nice formatting and timestamps and such once.
Somewhat related, one can trap various signals in a shell script and call a bash function when they happen. Thus, one can trap EXIT and call a cleanup function that also triggers on Ctrl-C, and I'm almost certain (sorry, can't test right now) one can also trap ERR and catch errors in a shell script nicely (though I'm not sure if that also goes for external commands inside said shell script).
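For instance (a sketch; the cleanup function and tmpfile are made up):

    cleanup() { rm -f "$tmpfile"; }                  # whatever cleanup you need
    trap cleanup EXIT                                # in bash this also runs on Ctrl-C
    trap 'echo "error near line $LINENO" >&2' ERR    # fires when a command fails

As far as I know the ERR trap does fire for failing external commands too, though not for commands whose result is tested in an if/&&/||.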
the latter form is preferable because it is easier and not as prone to concurrency problems.
Although it should be said that concurrency-safety and idempotency are seldom achieved just by using force flags in isolated spots; the whole operation needs to take them into account. There are probably more steps involved than just removing one file. The linked article leaves out the bigger picture, so it's probably not very helpful. Often it is sufficient to construct the new state under a temporary name and then move it into place in one operation.
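E.g. (a sketch; generate_config and the paths are made up, and the rename is only atomic within one filesystem):

    tmp=$(mktemp /etc/myapp/config.XXXXXX)   # temp file next to its destination
    generate_config > "$tmp"                 # hypothetical: build the new state
    mv "$tmp" /etc/myapp/config              # swap it into place in one operation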
I tried doing this once. It just wasn't worth the additional testing effort and increased complexity. Switching to Ansible was far more productive.
Interestingly, though, starting in Bash was a good MVP approach. Fast to get started. But as things started to slow down, the project switched to a better tool. I expect that for many projects the number of bash scripts may never grow large enough to merit a switch. So, sure, write scripts that are safe to run multiple times, but if that effort grows, know that there are better tools.
Ansible - in my experience (mostly with openstack-ansible years ago) - was too slow, hard to configure and debug, and hard to extend with new modules. :(
Moving toward "declarative state with enforcing/monitoring (idempotent!?) control loops" is much, much better. (What k8s does well. Now all we need is one master bash script to set up k8s :D )
I only remember Python, even from the very beginning. Cobbler (Michael DeHaan's prior project) was in Python too.
I could be wrong though; the only reason this stuff sticks in my mind is because I was working on a CFEngine deployment (scarred for life) at the time Ansible popped up.
Even better than Ansible is SaltStack or Puppet or Chef. In other words, something that sits on a controlled machine and already knows facts about said machine.
Not all idempotent automation operations need heavyweight centralized state management like a deployed Chef server... Ansible has a lot less boilerplate in many scenarios. Indeed, this is exactly why Chef even has 'chef-solo'/local mode, to let you avoid having a Chef Server at runtime.
While it's a matter of opinion and will depend on the problem you are trying to solve, I've found teams get productive on Ansible a lot faster, and Chef Cookbooks often end up requiring significant maintenance. I have similar complaints about Puppet too.
My biggest complaint with Chef is that its DSL so closely resembles Ruby that developers often assume the code in the cookbook is executed as pure Ruby rather than converted from a DSL; the transcoding phase in Chef is pretty complex and can lead to a lot of head-scratching while debugging.
> Use the -f flag which ignores non-existent files.
> rm -f example.txt
Instead of using --force to clobber all sorts of permission scenarios, wouldn't you want to specifically avoid non-existing files and handle that? I'm thinking something like....
stat -q example.txt && rm example.txt
With a literal filename, rimraf is safe enough, but I wouldn't mind some extra care around `rm -rf $somepath`.
Your idea isn't wrong/bad... But you are aware it creates a potential race condition, right? The `stat` check and the `rm` are two separate operations, with a nonzero time interval between them. So it's entirely possible for another process to delete or rename that file, after the `stat` but before the `rm` can run.
Resolving the race robustly is actually kind of hard, in Bash... You can check whether `rm` returns a nonzero exit status, but that might fail for all sorts of other reasons (e.g., bad permissions). I guess you could case/select the exit status numerically, but that could turn into a real case of bedbugs, really quick.
I guess you could repeat the `stat` check afterwards, if the `rm` fails? In that case, if the path doesn't exist afterwards, you can just swallow the `rm` error and let it roll.
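Something like this, maybe (untested sketch of that idea):

    if ! rm example.txt 2>/dev/null; then
        if [ -e example.txt ]; then
            echo "rm example.txt failed" >&2
            exit 1
        fi
        # the file is gone anyway, so treat the failure as success
    fi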
Practically speaking, I try to avoid using Bash for anything where automatic, reliable error handling is that important... I love cool Bash tricks, but it's just not designed for sophisticated control flow.
As far as I am aware, on Linux there is no way to stat and unlink in one syscall. And a second stat will create a second race condition. If you don't want the file to exist, why are you questioning whether you have permission or not? Just unlink the file via rm, if possible.
> As far as I am aware on Linux there is no way to stat and unlink in one syscall.
That just means you have to do a little more work.
1. Become root.
2. Fork until the process table is full.
3. Kill all other processes. After each kill, fork until the process table is again full.
4. When all processes other than the init process are yours, you can do the stat and unlink without worrying about race conditions since there is no one else to have a race with. (Assuming the file you want to stat and unlink isn't some file that your init process is interested in, and assuming it isn't on a network drive).
The side effects of this are annoying to deal with though and are probably worse than whatever problems an unhandled race condition would cause.
Another approach would be to create a new user, chown the directory containing the file to that user, become that user, chmod the directory to 0300, kill any process that has that file open, do your stat and unlink, chown and chmod the directory back to what they were, and delete the earlier created user.
(Or, if anyone wants to get pedantic and yes, this is ruining the punchline, but alas: shutdown -h now ; then reboot from the USB stick and mount the partition ; rm file.txt).
I believe you are correct about the lack of an atomic stat/unlink operation. Depending on how you implement the two steps, your potential errors are different.
`rm -f` handles some errors in a different way than the stat/rm approach. One can fail where the other would not.
Any given unhandled error may or may not be a problem, depending on the nature of the error, how the rest of the script is structured, and what your requirements are. There's nothing wrong with the stat/rm approach--it may be the better way to go.
> But you are aware it creates a potential race condition, right?
If your script has to be idempotent in the face of concurrent executions of itself, then that's an extra requirement which you have to handle with locking or whatever.
It is not implied by idempotency; there is meaningful idempotency which excludes the concurrency requirement.
If something can rename example.txt in parallel with your script, at any time during its execution, there is nothing you can do to ensure that it's gone.
Whatever step you take to ensure that it's gone can be preceded by the parallel rename.
I'm not addressing idempotency, that's a separate issue. The comparison we're discussing here is about the relative merits of `rm -f` versus the stat/rm (no '-f' option) approach. The parent post addressed a potential weakness of using the '-f' option--and the proposed alternative brings tradeoffs.
I believe your point about idempotency is correct, though.
I doubt that a bash script will be concurrent. If you are running your scripts in parallel, then chances are that they will be screwed because of some other dependency/side effect.
I would not trust things like `ln -sf` to be thread-safe either.
For shell scripts, a reasonable approach is to use a (correctly implemented) lock file at the start, then a bunch of `test` (aka `[`) blocks to check things. That does a decent job of documenting, too. Vs. everyone who looks at the script needing to know the less-common command line options.
Then, in the bigger picture, make sure that nobody else's script - maybe with its own lock file - starts overlapping with your script.
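For the lock itself, one correct implementation, assuming flock(1) is available:

    exec 9>/var/lock/myscript.lock   # lock path is just an example
    flock -n 9 || { echo "another instance is already running" >&2; exit 1; }
    # the lock is held for the life of the script and released when fd 9 closes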
Yes. I'm not making a case that test gets around that problem, just that in the cases where you want to test something about a file, such as existence, emptiness/non-emptiness, etc, test is the more versatile (and more standard) choice.
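For instance, a few of the checks test provides:

    [ -e "$f" ]   # exists
    [ -s "$f" ]   # exists and is non-empty
    [ -d "$f" ]   # is a directory
    [ -r "$f" ]   # is readable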
Maybe, depending on your definition of idempotent. It does update the modification time of the file, which does result in a different state if it is run multiple times. Although in practice, that probably doesn't matter most of the time.
…and then add your task-specific rules to your own LOL chain and leave everyone else's alone.
Much nicer than polluting the top-level chains with the iptables equivalent of global variables. You can work on your own bit of the firewall without clobbering all the other stuff.
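A sketch of that pattern (the chain name and rule are made up):

    iptables -N MYAPP 2>/dev/null || true             # create the chain if missing
    iptables -C INPUT -j MYAPP 2>/dev/null ||
        iptables -A INPUT -j MYAPP                    # hook it into INPUT only once
    iptables -F MYAPP                                 # flushing touches only your rules
    iptables -A MYAPP -p tcp --dport 8080 -j ACCEPT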
First encountered this concept while writing Ansible scripts. It makes quick intuitive sense, but actually doing it is a lot harder.
E.g. approximately zero percent of the Stack Overflow code snippets you google are idempotent.
So as a new learner, that's a show-stopper right there. You're barely hanging on to the stuff you're being taught, let alone able to see a big-picture concept that is miles above your current level.
I'm helping introduce some DevOps concepts to an ops team that has been doing things in the "traditional" way for a very long time. They've been building scripts in the past couple years to replace what had always been manual work, but it's been interesting to see their perspective. Nothing is idempotent on purpose: they know whether a database exists, whether schema changes are pending, whether a hostname needs to be updated in DNS, so their mindset is simply to not run those scripts if not needed.
I thought the big task would be introducing source control and CI/CD, but I quickly realized idempotency is actually the more fundamental and key concept that I need them to embrace.
Unless you can code carefully, it's not worth pretending a script is idempotent.
The example checks for /mnt/dev in a file but it doesn't check whether the string is in a comment or if it's part of /mnt/develop, and pretending you've achieved idempotency is dangerous unless you've gone through the effort of checking every corner case.
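For example, the difference between a naive match and one that checks the actual mount-point field (sketch):

    # naive: also matches comment lines and /mnt/develop
    grep -q '/mnt/dev' /etc/fstab
    # stricter: skip comments and compare field 2 (the mount point) exactly
    awk '$1 !~ /^#/ && $2 == "/mnt/dev"' /etc/fstab | grep -q .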
It's easier and safer to just treat it as non-idempotent in the first place.
> Unless you can code carefully, it's not worth pretending a script is idempotent.
Just because you can't reach perfection doesn't mean it's useless to strive towards it.
Idempotency is a very useful property that more shell script writers should be made aware of, even if they only learn basic things like "mkdir -p" or "rm -f SOMEFILE".
Orchestration/install/setup scripts are usually run while nothing else is running. Bash is not ideal for this. But it's not ideal for almost anything. Still, making things idempotent-ish is better than bailing on every recoverable error and making the admin do the cleanup manually so the script can finally do what it's supposed to do.
(Of course, a few years ago immutable infrastructure was all the rage, because it means that if the script runs once, you pack it up as a VM and you're done. And that's how the docker sausage images are made.)
I wrote these years ago. They're damn handy. It's true that they're not implemented in Bash (that would be nuts), but having them on hand lets me do much more on the command line than would otherwise be possible.
~/bin/union
===========
#! /usr/bin/awk -f
# print each line only the first time it is seen across all inputs (union)
!acc[$0]++
~/bin/intersection
==================
#! /usr/bin/awk -f
# a line is in the intersection if it occurs in every input file
# (ENDFILE is a gawk extension)
!buf[$0]++ { acc[$0] += 1 }    # count each distinct line once per file
ENDFILE {
    delete buf
    files++
}
END {
    for (k in acc) if (acc[k] == files) print k
}
~/bin/set-diff
==============
#! /usr/bin/awk -f
# lines of the first file minus the lines of every later file
# (ENDFILE is a gawk extension)
!filenum { acc[$0] = 1 }      # collect lines of the first file
filenum  { delete acc[$0] }   # drop anything seen in later files
ENDFILE  { filenum++ }
END {
    for (k in acc) print k
}
In bash? How would you even implement a set in bash without just doing linear greps? Or did someone add sets to bash 20 years ago and I never got the memo?
You could use the 'look' command, which does a binary search.
It's basically meant to look up spellings in /usr/share/dict/words, but it can work on any file. It will match any line that your pattern is a prefix of, so you'd have to add logic to eliminate longer matches.
But if you had some huge file to search and you wanted to do it from a shell script, that would be one way. Caveat: although it's fairly standard, 'look' might not be installed on every system.
Also, you have to be sure to maintain your file in sorted order. So no adding things by appending to the end; checking if something is in the set is much quicker at the expense of adding things being much slower.
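For example (assuming words.sorted is a sorted file of set members, one per line):

    # binary search for the prefix, then filter to an exact whole-line match
    look apple words.sorted | grep -qxF apple && echo "apple is in the set"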
I agree with you. Bash is a wonderful tool, but it's not very well designed for sophisticated control flow. Gotta know when to say when, and switch to something else.
For a lot of these I think idempotency is the wrong angle to look at it. For example, the problem with mkdir foo is not that it's non-idempotent, it's that (in most cases) the directory already existing is not an error to begin with: it still satisfies the same post-condition you desired. To put it another way, you're usually trying to say "ensure this directory exists", not "ensure you create this directory", and as such, unless you're really trying to test directory creation, mkdir without -p is a semantic bug regardless of whether you need to run it multiple times.
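In other words (minimal illustration):

    mkdir foo      # asserts "create this directory"; fails if it already exists
    mkdir -p foo   # asserts "this directory should exist"; succeeds either way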
Only Sith deal in absolutes -- I feel that with TDD. Yes, our tests should catch most if not all regressions - no that does not mean we're inherently writing bug-free code as long as it passes all our tests. If someone claims that a methodology is totally necessary, take it with a grain of salt and test your code.
A lot of those are not idempotency. Checking if the file exists before creating it assumes the content of the file that does exist is the same as what you're trying to put in, which is not always true. Same for the mkfs example: presumably the volume might already have a fs on it that you want to overwrite, one which is not ext4.
Exactly, came here to say this. They should delete the line/file/partition if it exists, and append/create/format it from scratch every time in order to respect idempotency.
Another thing I would recommend is the use of temp directories: you can assume they are empty (easier to reason about). Then do all the work there, and when all operations were successful, you "commit" the modified files with "mv/cp -r" and overwrite the real files.
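A rough sketch of that flow (the names are made up, and the final mv is only atomic within one filesystem):

    workdir=$(mktemp -d)
    trap 'rm -rf "$workdir"' EXIT          # clean up the temp dir no matter what
    do_all_the_work "$workdir"             # hypothetical: stage everything here
    mv "$workdir/output" /srv/app/output   # commit in one operation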
I would call the use of temporary files/folders "transactionality" rather than "idempotency" though.
I wish the order of the arguments of ln were easier to get right. One time I did an ln --force with the arguments in the wrong order and ended up ruining all the files. Had to restore from backup.
I always used to think ln's argument order was totally counterintuitive too. Then I realised that they might have done it that way to account for the case where you only give one argument. If you use `ln <targetpath>` without a link path, it will make a link in the current directory with the same name as the target file. This way, in both cases, the target is the first argument.
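So in both forms the target comes first:

    ln -s /path/to/target linkname   # link name given explicitly, second
    ln -s /path/to/target            # creates ./target in the current directory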
Maybe find a mnemonic? In college, while I paused to remember the order, a friend blurted out "fake things last", and that phrase has stuck with me. I always hesitate a second but never get it wrong now.
In my opinion, if you feel the need to write bash scripts that mount and/or format drives, there's already something wrong with your approach.
Depending on the environment (old school linux system vs. orchestration suite) I'd rather do it manually, or if possible, let IaC (eg. Terraform, Ansible) handle such infrastructure tasks.
For lots of people, bash is the hammer, everything's a nail. Can be very frustrating to inherit such environments.
I don't want this to sound combative, but maybe you should become more comfortable in bash.
Maybe you don't need portability in your code, but maybe you could use more portability of your skills.
Maybe you can assume that Ansible will be on every system you admin, but that will definitely reduce your options, and it's possible you don't even know what benefits you are trading away by not having the depth.
It's all a big maybe, but maybe some of those people made good trades for the problems they encountered.
Perhaps you conflated /etc/fstab with $HOME/.bash_history? Because for sure that "sudo tee -a" will cheerfully append its stdin to the named file each time it is run, at least until you run out of disk space.
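If the goal is an idempotent append, one common sketch is to test for the exact line first (the fstab entry here is made up):

    line='/dev/sdb1 /mnt/dev ext4 defaults 0 2'
    grep -qxF "$line" /etc/fstab || echo "$line" | sudo tee -a /etc/fstab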
That solves reentrancy, when you don't want two copies of a script active at once, but what if you only want to run the lines in a script that weren't already run on last invocation?
write the actual script in a heredoc, and wrap it in a script that pipes it line-by-line into a shell and tees to the lockfile. then, if the script stops, you have a journal of what's executed so far, and the next execution can just skip everything duplicated in the lockfile.
totally foolproof, except for the race condition between when your script crashes and when you hit up-arrow then enter: someone else with your user account might change it maliciously.
and what do you do if the script stops for some reason? It can be difficult to know what you need to do to properly get back into a valid state. Just deleting the lock can cause problems.
I agree, `mktemp` is very useful. I recently worked on a script that would check if you had a repo locally, if so update it, and otherwise clone it for you. `mktemp` made it a lot simpler, as I could just always clone it and be sure that I would have a freshly cloned state.
Bash has the distinction of effectively being one of the only "zero-dependency" cross-platform scripting languages for most software engineers: it "just works" on Linux/Macs. On Windows, Git for Windows includes Git Bash (for cross-platform hook scripts etc.), allowing bash scripts to work on Windows boxes as well.
Given the prevalence of git in the software development world, this has meant in my experience bash has very often been the most effective "zero dependency" cross-platform scripting language around. Python/JS/<other runtimes> will rarely work out of the box 100% of the time especially for Windows developers. This makes bash pretty valuable for things like build scripts if you have developers using different OSes.
If your source lives in git, you know there is a good chance the user has Git Bash too if they managed to clone a git repo on Windows, thus granting as near as one gets to "zero-dependency" cross-platform scripting in my experience.
Bash on Mac is deprecated and zsh is sufficiently different. Plus, on minimal installs of containers or VMs, one may not even have it, with /bin/sh provided by dash or a similar minimal shell.
So my rule of thumb is: if one cannot use plain /bin/sh and must depend on bash-specific things, one had better use a proper scripting language.
But generally not with stuff that one typically uses for scripting. In most ways, it's a superset of bash.
Besides word-splitting on unquoted variables being turned off with default settings (which is turned on when zsh is called as bash/sh with a symlink or similar), what other differences do you think would be common to find when running random bash scripts with zsh?
This is a really unproductive comment. The bash script versus ‘real scripting language’ is an argument that we will all be fighting over until the end of time, so no need to bring it up here.
The subject is bash scripts, so it's safe to assume anyone reading the article has decided they have a good reason for implementing the work in bash. As someone who is trying to level up their bash skills, I found the information in this article very useful.
That still doesn't solve the issue that idempotency is getting at, though. You can very easily and happily write bad python/go/ruby/etc. code that does things like fail to handle a directory that already exists, a file that was previously created, etc. I'd even argue it's more difficult in those languages, since it's likely more error-checking code to write vs. passing a flag to a command in a shell script.
I have a PHP script that does idempotency. It was strictly easier to write than similar code in bash the moment one needs non-trivial and robust patching of config files.
And passing force flags can lead to subtle issues, as -f and friends may have wrong behavior in corner cases. So for this reason I do not use them in shell scripts and rather do explicit tests before the command, like `test -f file && rm file`.
Doing so will solve none of the problems outlined in the article. You will still have to write all these patterns in Python to make them idempotent.
Also, for systems stuff you're not gaining much when your interface is calling binaries. A script of subprocess.run calls is way more cumbersome, and the moment you want to pip install something it's no longer portable and a huge PITA to distribute.
Bash has lots of footguns because it’s been around a while; run your code through shellcheck if you’re not confident. It’s the lingua franca of Linux userspace.
Plus Bash is one basic scripting language that every sysadmin worth their salt knows, even more than sh.
There will be religious wars until the end of time about Python vs Go vs Ruby vs Rust vs "The New Hotness". Popular languages change; bash, as a shell language, outlives them.
In 1995 Perl was the hot language. Bash was there too. Now people can't maintain Perl code, but Bash still runs.
Bash doesn't require add-on libraries or modules, it uses system built-ins. It's the lowest common denominator for system scripting.
Also, while Mac tries to force people to switch to zsh, and most lemmings just obey, you can still make Macs obey you, and use bash as your login shell.
Bash is the glue that's present everywhere. If you need a script to bootstrap something you're better off writing it in Bash than Python.
If you think Python is the answer then you have to deal with the whole nonsense around pipenv/poetry/pyenv/asdf/virtualenv/setup.py/requirements.txt/pyproject.toml/wheels/... 5 hours later .../system packages/anaconda.