Never use potentially dangerous commands (like rm -f) just to avoid the prospect of an error!
Instead, one should practice good error handling. For some scripts, and depending on the audience, "set -e" may be sufficient. But usually it's better to handle each failure explicitly.
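For example, a minimal sketch of what I mean:

    rm file.txt || { echo "failed to remove file.txt" >&2; exit 1; }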
Or, for longer shell scripts, I always include a little function that deals with error handling. You can call it in the same way:
rm file.txt || errorHandler "ignore"
As shown above, the function can be built to take various parameters, and can then for instance abort or ignore the error. It can also take care of updating a log file.
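A minimal sketch of such a function (the log file name and the exact behavior are just an example):

    errorHandler() {
        local status=$?              # exit status of the command that just failed
        local action="${1:-abort}"
        echo "$(date '+%F %T') command failed with status $status" >> error.log
        if [ "$action" != "ignore" ]; then
            exit "$status"
        fi
    }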
Somewhat related, I do logging in shell scripts with a function, too:
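Roughly like this (the format and log file name are just an example):

    log() {
        printf '%s [%s] %s\n' "$(date '+%F %T')" "${2:-INFO}" "$1" >> script.log
    }

    log "starting backup"
    log "disk almost full" WARN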
That way, I only need to write the code that writes log lines with nice formatting and timestamps and such once.
Somewhat related, one can trap various signals in a shell script and call a bash function when they happen. Thus, one can trap EXIT and call a cleanup function that also triggers on Ctrl-C, and I'm almost certain (sorry, can't test right now) one can also trap ERR and catch errors in a shell script nicely (though I'm not sure if that also goes for external commands inside said shell script).
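For instance (a sketch; the cleanup function and tmpfile are made up):

    cleanup() { rm -f "$tmpfile"; }                  # whatever cleanup you need
    trap cleanup EXIT                                # in bash this also runs on Ctrl-C
    trap 'echo "error near line $LINENO" >&2' ERR    # fires when a command fails

As far as I know the ERR trap does fire for failing external commands too, though not for commands whose result is tested in an if/&&/||.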
the latter form is preferable because it is easier and not as prone to concurrency problems.
Although it should be said that concurrency-safety and idempotency are seldom achieved just by using force flags in isolated spots; the whole operation needs to take them into account. There are probably more steps involved than just removing one file. The linked article leaves out the bigger picture, so it's probably not very helpful. Often it is sufficient to construct the new state under a temporary name and then move it into place in one operation.
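E.g. (a sketch; generate_config and the paths are made up, and the rename is only atomic within one filesystem):

    tmp=$(mktemp /etc/myapp/config.XXXXXX)   # temp file next to its destination
    generate_config > "$tmp"                 # hypothetical: build the new state
    mv "$tmp" /etc/myapp/config              # swap it into place in one operation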
I tried doing this once. It just wasn't worth the additional testing effort and increased complexity. Switching to Ansible was far more productive.
Interestingly, though, starting in Bash was a good MVP approach. Fast to get started. But as things started to slow down, the project switched to a better tool. I expect that for many projects the number of bash scripts may never grow large enough to merit a switch. So, sure, write scripts that are safe to run multiple times, but if that effort grows, know that there are better tools.
Ansible - in my experience (mostly with openstack-ansible years ago) - was too slow, hard to configure and debug, and hard to extend with new modules. :(
Moving toward "declarative state with enforcing/monitoring (idempotent!?) control loops" is much, much better. (What k8s does well. Now all we need is one master bash script to set up k8s :D )
I only remember Python, even from the very beginning. Cobbler (Michael DeHaan's prior project) was in Python too.
I could be wrong though; the only reason this stuff sticks in my mind is because I was working on a CFEngine deployment (scarred for life) at the time Ansible popped up.
Even better than Ansible is SaltStack or Puppet or Chef. In other words, something that sits on a controlled machine and already knows facts about said machine.
Not all idempotent automation operations need heavyweight centralized state management like a deployed Chef server... Ansible has a lot less boilerplate in many scenarios. Indeed, this is exactly why Chef even has 'chef-solo'/local mode, to let you avoid having a Chef Server at runtime.
While it's a matter of opinion and will depend on the problem you are trying to solve, I've found teams get productive on Ansible a lot faster, and Chef Cookbooks often end up requiring significant maintenance. I have similar complaints about Puppet too.
My biggest complaint with Chef is that its DSL so closely resembles Ruby that developers often assume the code in the cookbook is executed as pure Ruby rather than converted from a DSL; the transcoding phase in Chef is pretty complex and can lead to a lot of head-scratching while debugging.
> Use the -f flag which ignores non-existent files.
> rm -f example.txt
Instead of using --force to clobber all sorts of permission scenarios, wouldn't you want to specifically avoid non-existing files and handle that? I'm thinking something like....
stat -q example.txt && rm example.txt
With a literal filename, rimraf is safe enough, but I wouldn't mind some extra care around `rm -rf $somepath`.
Your idea isn't wrong/bad... But you are aware it creates a potential race condition, right? The `stat` check and the `rm` are two separate operations, with a nonzero time interval between them. So it's entirely possible for another process to delete or rename that file, after the `stat` but before the `rm` can run.
Resolving the race robustly is actually kind of hard, in Bash... You can check whether `rm` returns a nonzero exit status, but that might fail for all sorts of other reasons (e.g., bad permissions). I guess you could case/select the exit status numerically, but that could turn into a real case of bedbugs, really quick.
I guess you could repeat the `stat` check afterwards, if the `rm` fails? In that case, if the path doesn't exist afterwards, you can just swallow the `rm` error and let it roll.
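Something like this, maybe (untested sketch of that idea):

    if ! rm example.txt 2>/dev/null; then
        if [ -e example.txt ]; then
            echo "rm example.txt failed" >&2
            exit 1
        fi
        # the file is gone anyway, so treat the failure as success
    fi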
Practically speaking, I try to avoid using Bash for anything where automatic, reliable error handling is that important... I love cool Bash tricks, but it's just not designed for sophisticated control flow.
As far as I am aware, on Linux there is no way to stat and unlink in one syscall. And a second stat will create a second race condition. If you don't want the file to exist, why are you questioning whether you have permission or not? Just unlink the file via rm, if possible.
> As far as I am aware on Linux there is no way to stat and unlink in one syscall.
That just means you have to do a little more work.
1. Become root.
2. Fork until the process table is full.
3. Kill all other processes. After each kill, fork until the process table is again full.
4. When all processes other than the init process are yours, you can do the stat and unlink without worrying about race conditions since there is no one else to have a race with. (Assuming the file you want to stat and unlink isn't some file that your init process is interested in, and assuming it isn't on a network drive).
The side effects of this are annoying to deal with though and are probably worse than whatever problems an unhandled race condition would cause.
Another approach would be to create a new user, chown the directory containing the file to that user, become that user, chmod the directory to 0300, kill any process that has that file open, do your stat and unlink, chown and chmod the directory back to what they were, and delete the earlier created user.
(Or, if anyone wants to get pedantic and yes, this is ruining the punchline, but alas: shutdown -h now ; then reboot from the USB stick and mount the partition ; rm file.txt).
I believe you are correct about the lack of an atomic stat/unlink operation. Depending on how you implement the two steps, your potential errors are different.
`rm -f` handles some errors in a different way than the stat/rm approach. One can fail where the other would not.
Any given unhandled error may or may not be a problem, depending on the nature of the error, how the rest of the script is structured, and what your requirements are. There's nothing wrong with the stat/rm approach--it may be the better way to go.
> But you are aware it creates a potential race condition, right?
If your script has to be idempotent in the face of concurrent executions of itself, then that's an extra requirement which you have to handle with locking or whatever.
It is not implied by idempotency; there is meaningful idempotency which excludes the concurrency requirement.
If something can rename example.txt in parallel with your script, at any time during its execution, there is nothing you can do to ensure that it's gone.
Whatever step you take to ensure that it's gone can be preceded by the parallel rename.
I'm not addressing idempotency, that's a separate issue. The comparison we're discussing here is about the relative merits of `rm -f` versus the stat/rm (no '-f' option) approach. The parent post addressed a potential weakness of using the '-f' option--and the proposed alternative brings tradeoffs.
I believe your point about idempotency is correct, though.
I doubt that a bash script will be concurrent. If you are running your scripts in parallel, then chances are that they will be screwed because of some other dependency/side effect.
I would not trust things like `ln -sf` to be thread-safe either.
For shell scripts, a reasonable approach is to use a (correctly implemented) lock file at the start, then a bunch of `test` (aka `[`) blocks to check things. That does a decent job of documenting, too. Vs. everyone who looks at the script needing to know the less-common command line options.
Then, in the bigger picture, make sure that nobody else's script - maybe with its own lock file - starts overlapping with your script.
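For the lock itself, one correct implementation, assuming flock(1) is available:

    exec 9>/var/lock/myscript.lock   # lock path is just an example
    flock -n 9 || { echo "another instance is already running" >&2; exit 1; }
    # the lock is held for the life of the script and released when fd 9 closes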
Yes. I'm not making a case that test gets around that problem, just that in the cases where you want to test something about a file, such as existence, emptiness/non-emptiness, etc, test is the more versatile (and more standard) choice.
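For instance, a few of the checks test provides:

    [ -e "$f" ]   # exists
    [ -s "$f" ]   # exists and is non-empty
    [ -d "$f" ]   # is a directory
    [ -r "$f" ]   # is readable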
Maybe, depending on your definition of idempotent. It does update the modification time of the file, which does result in a different state if it is run multiple times. Although in practice, that probably doesn't matter most of the time.
…and then add your task-specific rules to your own LOL chain and leave everyone else's alone.
Much nicer than polluting the top-level chains with the iptables equivalent of global variables. You can work on your own bit of the firewall without clobbering all the other stuff.
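A sketch of that pattern (the chain name and rule are made up):

    iptables -N MYAPP 2>/dev/null || true             # create the chain if missing
    iptables -C INPUT -j MYAPP 2>/dev/null ||
        iptables -A INPUT -j MYAPP                    # hook it into INPUT only once
    iptables -F MYAPP                                 # flushing touches only your rules
    iptables -A MYAPP -p tcp --dport 8080 -j ACCEPT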
First encountered this concept while writing Ansible scripts. It makes quick intuitive sense, but actually doing it is a lot harder.
E.g. approximately zero percent of the Stack Overflow code snippets you google are idempotent.
So as a new learner, that's a show-stopper right there. You're barely hanging on to the stuff you're being taught, let alone able to see a big-picture concept that is miles above your current level.
I'm helping introduce some DevOps concepts to an ops team that has been doing things in the "traditional" way for a very long time. They've been building scripts in the past couple years to replace what had always been manual work, but it's been interesting to see their perspective. Nothing is idempotent on purpose: they know whether a database exists, whether schema changes are pending, whether a hostname needs to be updated in DNS, so their mindset is simply to not run those scripts if not needed.
I thought the big task would be introducing source control and CI/CD, but I quickly realized idempotency is actually the more fundamental and key concept that I need them to embrace.
Unless you can code carefully, it's not worth pretending a script is idempotent.
The example checks for /mnt/dev in a file but it doesn't check whether the string is in a comment or if it's part of /mnt/develop, and pretending you've achieved idempotency is dangerous unless you've gone through the effort of checking every corner case.
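For example, the difference between a naive match and one that checks the actual mount-point field (sketch):

    # naive: also matches comment lines and /mnt/develop
    grep -q '/mnt/dev' /etc/fstab
    # stricter: skip comments and compare field 2 (the mount point) exactly
    awk '$1 !~ /^#/ && $2 == "/mnt/dev"' /etc/fstab | grep -q .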
It's easier and safer to just treat it as non-idempotent in the first place.
> Unless you can code carefully, it's not worth pretending a script is idempotent.
Just because you can't reach perfection doesn't mean it's useless to strive towards it.
Idempotency is a very useful property that more shell script writers should be made aware of, even if they only learn basic things like "mkdir -p" or "rm -f SOMEFILE".
Orchestration/install/setup scripts are usually run while nothing else is running. Bash is not ideal for this. But it's not ideal for almost anything. Still, making things idempotent-ish is better than bailing on every recoverable error and making the admin do the cleanup manually so the script can finally do what it's supposed to do.
(Of course, a few years ago immutable infrastructure was all the rage, because it means that if the script runs once, you pack it up as a VM and you're done. And that's how the docker sausage images are made.)
I wrote these years ago. They're damn handy. It's true that they're not implemented in Bash (that would be nuts), but having them on hand lets me do much more on the command line than would otherwise be possible.
~/bin/union
===========
#! /usr/bin/awk -f
# print each line only the first time it is seen across all inputs (union)
!acc[$0]++
~/bin/intersection
==================
#! /usr/bin/awk -f
# a line is in the intersection if it occurs in every input file
# (ENDFILE is a gawk extension)
!buf[$0]++ { acc[$0] += 1 }    # count each distinct line once per file
ENDFILE {
    delete buf
    files++
}
END {
    for (k in acc) if (acc[k] == files) print k
}
~/bin/set-diff
==============
#! /usr/bin/awk -f
# lines of the first file minus the lines of every later file
# (ENDFILE is a gawk extension)
!filenum { acc[$0] = 1 }      # collect lines of the first file
filenum  { delete acc[$0] }   # drop anything seen in later files
ENDFILE  { filenum++ }
END {
    for (k in acc) print k
}
In bash? How would you even implement a set in bash without just doing linear greps? Or did someone add sets to bash 20 years ago and I never got the memo?
You could use the 'look' command, which does a binary search.
It's basically meant to look up spellings in /usr/share/dict/words, but it can work on any file. It will match any line that your pattern is a prefix of, so you'd have to add logic to eliminate longer matches.
But if you had some huge file to search and you wanted to do it from a shell script, that would be one way. Caveat: although it's fairly standard, 'look' might not be installed on every system.
Also, you have to be sure to maintain your file in sorted order. So no adding things by appending to the end; checking if something is in the set is much quicker at the expense of adding things being much slower.
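For example (assuming words.sorted is a sorted file of set members, one per line):

    # binary search for the prefix, then filter to an exact whole-line match
    look apple words.sorted | grep -qxF apple && echo "apple is in the set"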
I agree with you. Bash is a wonderful tool, but it's not very well designed for sophisticated control flow. Gotta know when to say when, and switch to something else.
For a lot of these I think idempotency is the wrong angle to look at it. For example, the problem with mkdir foo is not that it's non-idempotent, it's that (in most cases) the directory already existing is not an error to begin with: it still satisfies the same post-condition you desired. To put it another way, you're usually trying to say "ensure this directory exists", not "ensure you create this directory", and as such, unless you're really trying to test directory creation, mkdir without -p is a semantic bug regardless of whether you need to run it multiple times.
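In other words (minimal illustration):

    mkdir foo      # asserts "create this directory"; fails if it already exists
    mkdir -p foo   # asserts "this directory should exist"; succeeds either way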
Only Sith deal in absolutes -- I feel that with TDD. Yes, our tests should catch most if not all regressions - no that does not mean we're inherently writing bug-free code as long as it passes all our tests. If someone claims that a methodology is totally necessary, take it with a grain of salt and test your code.
A lot of those are not idempotency. Checking if the file exists before creating it assumes the content of the file that does exist is the same as what you're trying to put in, which is not always true. Same for the mkfs example: presumably the volume might already have a fs on it that you want to overwrite, one which is not ext4.
Exactly, came here to say this. They should delete the line/file/partition if it exists, and append/create/format it from scratch every time in order to respect idempotency.
Another thing I would recommend is the use of temp directories: you can assume they are empty (easier to reason about). Then do all the work there, and when all operations were successful, you "commit" the modified files with "mv/cp -r" and overwrite the real files.
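A rough sketch of that flow (the names are made up, and the final mv is only atomic within one filesystem):

    workdir=$(mktemp -d)
    trap 'rm -rf "$workdir"' EXIT          # clean up the temp dir no matter what
    do_all_the_work "$workdir"             # hypothetical: stage everything here
    mv "$workdir/output" /srv/app/output   # commit in one operation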
I would call the use of temporary files/folders "transactionality" rather than "idempotency" though.
I wish the order of the arguments of ln were easier to get right. One time I did an ln --force with the arguments in the wrong order and ended up ruining all the files. Had to restore from backup.
I always used to think ln's argument order was totally counterintuitive too. Then I realised that they might have done it that way to account for the case where you only give one argument. If you use `ln <targetpath>` without a link path, it will make a link in the current directory with the same name as the target file. This way, in both cases, the target is the first argument.
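So in both forms the target comes first:

    ln -s /path/to/target linkname   # link name given explicitly, second
    ln -s /path/to/target            # creates ./target in the current directory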
Maybe find a mnemonic? In college, while I paused to remember the order, a friend blurted out "fake things last", and that phrase has stuck with me. I always hesitate a second but never get it wrong now.
In my opinion, if you feel the need to write bash scripts that mount and/or format drives, there's already something wrong with your approach.
Depending on the environment (old school linux system vs. orchestration suite) I'd rather do it manually, or if possible, let IaC (eg. Terraform, Ansible) handle such infrastructure tasks.
For lots of people, bash is the hammer, everything's a nail. Can be very frustrating to inherit such environments.
I don't want this to sound combative, but maybe you should become more comfortable in bash.
Maybe you don't need portability in your code, but maybe you could use more portability of your skills.
Maybe you can assume that Ansible will be on every system you admin, but that will definitely reduce your options, and it's possible you don't even know what benefits you are trading away by not having the depth.
It's all a big maybe, but maybe some of those people made good trades for the problems they encountered.
Perhaps you conflated /etc/fstab with $HOME/.bash_history? Because for sure that "sudo tee -a" will cheerfully append its stdin to the named file each time it is run, at least until you run out of disk space.
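If the goal is an idempotent append, one common sketch is to test for the exact line first (the fstab entry here is made up):

    line='/dev/sdb1 /mnt/dev ext4 defaults 0 2'
    grep -qxF "$line" /etc/fstab || echo "$line" | sudo tee -a /etc/fstab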
That solves reentrancy, when you don't want two copies of a script active at once, but what if you only want to run the lines in a script that weren't already run on last invocation?
write the actual script in a heredoc, and wrap it in a script that pipes it line-by-line into a shell and tees to the lockfile. then, if the script stops, you have a journal of what's executed so far, and the next execution can just skip everything duplicated in the lockfile.
totally foolproof, except for the race condition between when your script crashes and when you hit up-arrow then enter: someone else with your user account might change it maliciously.
and what do you do if the script stops for some reason? It can be difficult to know what you need to do to properly get back into a valid state. Just deleting the lock can cause problems.
I agree, `mktemp` is very useful. I recently worked on a script that would check if you had a repo locally, if so update it, and otherwise clone it for you. `mktemp` made it a lot simpler, as I could just always clone it and be sure that I would have a freshly cloned state.
Bash has the distinction of effectively being one of the only "zero-dependency" cross-platform scripting languages for most software engineers: it "just works" on Linux/Macs. On Windows, Git for Windows includes Git Bash (for cross-platform hook scripts etc.), allowing bash scripts to work on Windows boxes as well.
Given the prevalence of git in the software development world, this has meant in my experience bash has very often been the most effective "zero dependency" cross-platform scripting language around. Python/JS/<other runtimes> will rarely work out of the box 100% of the time especially for Windows developers. This makes bash pretty valuable for things like build scripts if you have developers using different OSes.
If your source lives in git, you know there is a good chance the user has Git Bash too if they managed to clone a git repo on Windows, thus granting as near as one gets to "zero-dependency" cross-platform scripting in my experience.
Bash on Mac is deprecated and zsh is sufficiently different. Plus, on minimal installs of containers or VMs, one may not even have it, with /bin/sh provided by dash or a similar minimal shell.
So my rule of thumb is: if one cannot use plain /bin/sh and must depend on bash-specific things, one had better use a proper scripting language.
But generally not with stuff that one typically uses for scripting. In most ways, it's a superset of bash.
Besides word-splitting on unquoted variables being turned off with default settings (which is turned on when zsh is called as bash/sh with a symlink or similar), what other differences do you think would be common to find when running random bash scripts with zsh?
This is a really unproductive comment. The bash script versus ‘real scripting language’ is an argument that we will all be fighting over until the end of time, so no need to bring it up here.
The subject is bash scripts, so it's safe to assume anyone reading the article has decided they have a good reason for implementing the work in bash. As someone who is trying to level up their bash skills, I found the information in this article very useful.
That still doesn't solve the issue that idempotency is getting at, though. You can very easily and happily write bad python/go/ruby/etc. code that does things like fail to handle a directory that already exists, a file that was previously created, etc. I'd even argue it's more difficult in those languages, since it's likely more error-checking code to write vs. passing a flag to a command in a shell script.
I have a PHP script that does idempotency. It was strictly easier to write than similar code in bash the moment one needs non-trivial and robust patching of config files.
And passing force flags can lead to subtle issues, as -f and friends may have wrong behavior in corner cases. So for this reason I do not use them in shell scripts and rather do explicit tests before the command, like `test -f file && rm file`.
Doing so will solve none of the problems outlined in the article. You will still have to write all these patterns in Python to make them idempotent.
Also, for systems stuff you're not gaining much when your interface is calling binaries. A script of subprocess.run calls is way more cumbersome, and the moment you want to pip install something it's no longer portable and a huge PITA to distribute.
Bash has lots of footguns because it’s been around a while; run your code through shellcheck if you’re not confident. It’s the lingua franca of Linux userspace.
Plus Bash is one basic scripting language that every sysadmin worth their salt knows, even more than sh.
There will be religious wars until the end of time about Python vs Go vs Ruby vs Rust vs "The New Hotness". Popular languages change; bash, as a shell language, outlives them.
In 1995 Perl was the hot language. Bash was there too. Now people can't maintain Perl code, but Bash still runs.
Bash doesn't require add-on libraries or modules, it uses system built-ins. It's the lowest common denominator for system scripting.
Also, while Mac tries to force people to switch to zsh, and most lemmings just obey, you can still make Macs obey you, and use bash as your login shell.
Bash is the glue that's present everywhere. If you need a script to bootstrap something you're better off writing it in Bash than Python.
If you think Python is the answer then you have to deal with the whole nonsense around pipenv/poetry/pyenv/asdf/virtualenv/setup.py/requirements.txt/pyproject.toml/wheels/... 5 hours later .../system packages/anaconda.