There are two types of GitHub Actions workflows you can build.
1) Program with GitHub Actions. Google "how can I send an email with github actions?" and then plug in some marketplace tool to do it. Your workflows grow to 500-1,000 lines, sprout conditionals and all sorts of other nonsense, and the YAML becomes disgusting and hard to understand. GitHub Actions becomes a nightmare and you've invited vendor lock-in.
2) Configure with GitHub Actions. Always ask yourself "can I push this YAML complexity into a script?" and do it if you can. Send an email? Yes, that can go in a script. Your workflow ends up being about 50-60 lines as a result (see the sketch below) and very rarely needs to change once it's set up. GitHub Actions is suddenly fine, and you rarely have to do that stupid push-debug-commit loop, because you can debug the script locally.
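For concreteness, here's a minimal sketch of what (2) looks like; the script names are made up, the point is that the YAML stays a thin shim:

```yaml
# .github/workflows/ci.yml -- the workflow is just a task runner
name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # All the real logic lives in scripts you can run and debug locally:
      - run: ./ci/build.sh
      - run: ./ci/test.sh
      # "Send an email" is just another script, not a marketplace action:
      - run: ./ci/notify.sh
        if: failure()
```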
Every time I join a new team I tell them that (1) is the way to madness and (2) is the sensible approach. They always tepidly agree with me, and yet about half of the time they still do (1).
The thing is, the lack of debugging tools from Microsoft is really not much of a problem if you do (2), vendor lock-in is lower if you do (2), and debugging is easier if you do (2). But still, nobody does (2).
This is a great perspective, and one I agree with -- many of the woes associated with GitHub Actions can be eliminated by treating it just as a task substrate, and not trying to program in YAML.
At the same time, I've found that it often isn't sufficient to push everything into a proper programming language: I do sometimes (even frequently) need to use vendor-specific functionality in GHA, mark dependencies between jobs, invoke REST APIs that are already well abstracted as actions, etc. Re-implementing those things in a programming language of my choice is possible, but doesn't break the vendor dependency and is (IME) still brittle.
Essentially: the vendor lock-in value proposition for GHA is very, very strong. Convincing people that they should take option (2) means making a stronger value proposition, which is pretty hard!
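To be fair to (2), the genuinely GHA-specific surface can stay tiny. A sketch of job dependencies, which do have to live in the YAML (job and script names illustrative):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./ci/build.sh
  deploy:
    needs: build        # the inter-job dependency is GHA-specific; it can't move into a script
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./ci/deploy.sh
```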
You're right, it's not necessarily a good idea to be anal about this rule. If an action is simple to use and already built, I use it; I won't necessarily try to reimplement, say, the upload-artifact step in code.
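For instance, a mix like this seems reasonable (a sketch; paths are illustrative):

```yaml
steps:
  - uses: actions/checkout@v4
  - run: ./ci/build.sh                  # the logic stays in a script...
  - uses: actions/upload-artifact@v4    # ...but a trivial prebuilt action is fine
    with:
      name: dist
      path: dist/
```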
Another thing I noticed: if you do (1), sophisticated features like build caching and parallelization often become completely impractical, whereas if you default to (2) you can probably get them working with only a moderate amount of commit-push-debug.
Option (2) also makes it easier for developers to run their builds locally, so you're essentially using the same build chain for local debugging as you do for your Test/Staging/Prod environments, instead of maintaining two different build processes.
It's not just true for GHA, but for any build server really: The build server should be a script runner that adds history, artifact management, and permissions/auditing, but should delegate the actual build process to the repository it's building.
Good perspective. Unfortunately (1) is unavoidable when you're trying to automate GH itself (role assignments, tagging, etc.). But at this point, I would rather handle a lot of that manually than deal with GHA's awful debug loop.
FWIW, there's nektos/act, which aims to replicate GHA behavior locally, but I haven't tried it yet.
> Unfortunately (1) is unavoidable when you're trying to automate GH itself (role assignments, tagging, etc.)
Can't you just use the GitHub API for that? The script would be triggered by the YAML, but all the logic lives inside the script.
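E.g. tagging is one REST call from a script. A sketch, assuming the workflow exports GITHUB_TOKEN as an env var (OWNER/REPO are placeholders; GITHUB_SHA is set by the runner):

```bash
# Create a tag via the GitHub REST API, no marketplace action needed.
curl -sf -X POST \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/OWNER/REPO/git/refs" \
  -d "{\"ref\": \"refs/tags/v1.2.3\", \"sha\": \"$GITHUB_SHA\"}"
```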
But `act` is cool; I've used it for local debugging. Thing is, its output is impossibly verbose, and they don't aim to support everything an action does (which is fine if you stick to (2)).
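For reference, a typical `act` session looks roughly like this (flags from the versions I've used; check `act --help`):

```bash
act -l          # list the jobs act can see in .github/workflows/
act push        # run the workflow for a push event
act -j build    # run only the "build" job
```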
Yeah, I've done quite a bit of GitHub scripting via Octokit and it's pretty simple. Using GHA's built-in functionality might turn a five-line script into a one-liner, but I think being able to run the script directly is well worth the tradeoff.
The main thing that you can't decouple from GHA is pushing and pulling intermediate artifacts, which for some build pipelines is going to be a pretty big chunk of the logic.
How DO you debug your actions? I spend so long in the commit-action-debug-change loop it's absurd. I agree with your point re: (2) wholeheartedly though; it makes debugging scripts so much easier too. CI should be runnable locally, and GitHub Actions, while supported by some tooling, still isn't very easy to work with that way.
We may be splitting hairs given what this thread is going on about, but I strongly advocate for `--force-with-lease` as a sane default versus `-f` so that one does not blow away unexpectedly newer commits to the branch
The devil's in the details, etc, etc, but I think it's a _much_ more sane default, even for single-user setups/branches because accidents can happen and git DGAF
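i.e.:

```bash
git push --force-with-lease   # refuses the push if the remote branch moved since your last fetch
# optional alias so the safe version is the easy one to type:
git config --global alias.pushf 'push --force-with-lease'
```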
Act works pretty well to debug actions locally. It isn't perfect, but I find it handles about 90% of the write-test-repeat loop and therefore saves my teammates from dozens of tiny test PRs.
I may have misread this, but you know you can push to a branch and then run the action against it? That would reduce PRs if you're opening them just to check the action on master. You have to add a workflow_dispatch trigger to the workflow: https://docs.github.com/en/actions/using-workflows/manually-...
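i.e. the trigger block just needs something like (sketch):

```yaml
on:
  push:
  workflow_dispatch:   # adds a "Run workflow" button; you pick the branch in the UI
```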
Yeah, most of the time that is a good way to test. There are some specific actions that aren't easily tested outside of their regular spot, though. Mostly deployment-related pieces, due to the way our infrastructure is set up.
The main reason I aim for (2) is that I want to be able to drive my build locally if and when GitHub is down, and I want to be able to migrate away easily if I ever need to.
I think of it like this:
- I write scripts (as portable as possible) to build/test/sign/deploy/etc.
- They should always work locally.
- GitHub is for automating the setup of the environments where I can run those scripts, and then actually running them.
Totally get what you're saying. I once switched our workflow to trigger on PRs to make testing easier. Now, I'm all about using scripts — they're just simpler to test and fix.
I recommend making these scripts cross-platform for flexibility. Use matrix: and env: to handle it (see the sketch after this comment). Go for Perl, JavaScript, or Python over OS shells, and put file tasks in scripts to dodge path issues.
I've tried boxing these scripts into steps, but unless they're super generic for everyone, it doesn't seem worth it.
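Roughly what I mean, as a sketch (the env var and script name are made up):

```yaml
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    env:
      BUILD_MODE: release            # hypothetical: whatever your script reads
    steps:
      - uses: actions/checkout@v4
      - run: python ci/build.py      # one cross-platform script, every OS
```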
They don't seem to grasp how bad their setup is, and consequently are willing to put up with awful programming conditions. Even punch cards were better, as those people had the advantage of working with a real programming language with defined behaviour. "When exactly is this string interpolation step executed? In the anchor, or when referenced?" (Well, it depends.) No, it's black-box tinkering
(you might as well be prompt engineering)
The C in IaC is supposed to stand for code. Well, if you're supposed to code something, you need to:
- be able to assert correctness before you commit,
- be able to step through the code
If the setup they give you doesn't even have these minimal requirements you're going to be in trouble regardless of how brilliant an engineer you are.
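You can at least get partway there locally with linters. A sketch of what I'd reach for (actionlint is a third-party checker for workflow files; shellcheck covers the scripts):

```bash
shellcheck ci/*.sh                    # catch script bugs before committing
actionlint .github/workflows/*.yml    # catch workflow YAML mistakes statically
```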
I agree overall, but you oversimplify the issue a bit.
> can I push this YAML complexity into a script?
- what language is the script written in?
- will developers use the same language for all those scripts?
- does it need dependencies?
- where are we going to host scripts used by multiple github actions?
- if we end up putting those scripts in repositories, how do we update the actions once we release a new version of the scripts?
- how do you track those versions?
- how much does it cost to write a separate script and maintain it versus locking us in with an external github action?
These are just the first questions that pop into my mind, but there are more. And some answers may not be that difficult, yet it's still something to think about.
And I agree with the core idea (move logic outside pipeline configuration), but I can understand the tepid reaction you may get. It's not free, and you compromise on some things.
I think they framed it accurately and you are instead overcomplicating. Language for scripts is a decision that virtually every team ends up making regardless. The other questions are basically all irrelevant, since the scripts and actions are both stored in the repo, and therefore released and versioned together.
I think the point about maintenance cost is valid, but the thesis of the comment that you are responding to is that the prebuilt actions are a complexity trap.
I think you are still envisioning a fundamentally incorrect approach. Build scripts for a project are part of that project, not some external thing. The scripts are stored in the repository, and pulled from the branch being built. Dependencies for your build scripts aren't any different from any other build-time dependencies for your project.
I have a few open source projects that have lasted for 10+ years, and I can’t agree more with approach #2.
Ideally you want your scripting to handle all of the weird gotchas of different versions of host OSes, etc. Granted, my work is cross-platform, so the problem is compounded.
So far I’ve found relying on extensive custom tooling has allowed me to handle transitions from local, to Travis, to AppVeyor, to CircleCI and now also GitHub Actions.
You really want your CI config to specify the host platform and possibly set some env vars. Then it should invoke a single CI wrapper script. Ideally this can also be run locally.
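The wrapper script can default its inputs so the same entry point works locally too; a sketch (variable and script names are illustrative):

```bash
#!/usr/bin/env bash
# ci.sh -- single entry point, runnable from CI or from a dev machine.
set -euo pipefail

# CI sets these env vars; the defaults make bare local runs work as well.
: "${BUILD_MODE:=debug}"
: "${TARGET_PLATFORM:=$(uname -s)}"

./scripts/build.sh "$BUILD_MODE"
./scripts/test.sh
```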
There’s a curve. Stringy, declarative DSLs have high utility when used in linear, unconditional, stateless programming contexts.
Adding state?
Adding conditionals?
Adding (more than a couple) procedure calls?
These concepts perform poorly without common programming tools: testing (via compilation or development runtime), static analysis, intellisense, etc etc
Imagine the curve:
X axis is (vaguely) LinesOfYaml (lines of dsl, really)
Y axis is tool selection. Positive region of axis is “use a DSL”, lower region is “use a GeneralPurposeProgrammingLanguage”
The line starts at the origin, has a SMALL positive bump, then plummets downwards near-vertically.
Gets it right? Tools like ocurrent (contrasted against GH actions) [1], cdk (contrasted against TF yaml) [2]
Gets it wrong? Well, see the parent post. This made me so crazy at work (where seemingly everyone has been drinking the YAML DSL Kool-Aid) that I built a local product simulator and YAML generator for their systems, because "coding" against the product was so untenable.
Your advice is sane and I can tell it speaks from experience. Unfortunately, now that GitHub Actions is being exposed through Visual Studio, I fear we are going to see an explosion of (1), just because the process will be more disconnected from GitHub itself (no documentation or GitHub UI visible while working within Visual Studio).
I try to do (2), but I still run into annoyances. Like, I'll write a script to do some part of my release process. But then I start a new project and realize I need that script, so I copy it into the new repo. Then I fix a bug in that script, or add some new functionality, and I need to go and update the script in the other repo too.
Maybe this means I should encapsulate this into an action and check it in somewhere else. But I don't really feel like it; an action is a lot of overhead for a 20-line bash script. Not to mention that it erases the freedom from lock-in that the plain script gives me.
I guess I could check the script into a separate utility repo, and pull it into my other repos via git submodules? That's probably the least-bad solution. I'd still have to update the submodule refs when I make changes, but that's better than copy-pasting the scripts everywhere.
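The submodule flow is mostly tolerable; a sketch with a hypothetical ci-scripts repo:

```bash
# Share the scripts via a submodule instead of copy-pasting:
git submodule add https://github.com/you/ci-scripts ci-scripts
git commit -m "Add shared CI scripts"

# ...after fixing a bug in ci-scripts, each consuming repo bumps its pin:
git submodule update --remote ci-scripts
git commit -am "Bump ci-scripts"
```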
I agree, but of course all CI vendors build all their documentation, tutorials, and "best practices" 100% on the first option, for lock-in and to get you to use more of their ecosystem, like expensive caching and parallel runners. Many GitHub actions and CircleCI orbs could be replaced by a few lines of shell script.
Independent tutorials unfortunately fall into the same bucket, since they start from the official documentation to follow the so-called best practices, or just to get things working; and also, I would say, because shell scripts seem hackier to many people (unfairly so).
That's true for all CI services: do as little as possible in YAML, mostly just use it to start your own scripts. For the scripts, use something like Python or Deno to cover Linux, Mac, and Windows environments with the same code.
When GitHub actions came out, I felt bad about myself because I had no desire to learn their new programming language of breaking everything down into multiple small GitHub actions.
I think you explained quite well what I couldn't put my finger on last time:
Building every simple workflow out of a pile of 3rd party apps creates a lot of unnecessary complexity.
Since then, I have used GitHub actions for a few projects, but mostly stayed away from re-using and combining actions (except for the obvious use cases of "check out this branch").
YAML is perfect for simple scenarios, but users produce really complex use cases with it.
Is it possible to write a Python package that, based on a YAML specification, produces a Python API? The user would code in Python, and YAML would be the output.
I was working on a YAML syntax for creating UIs. I converted it to a Python API and I'm happy. For example, dynamic widgets were hard in YAML; in Python they are straightforward.
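A minimal sketch of the Python-in/YAML-out direction, using the PyYAML package (the workflow structure here is illustrative):

```python
import yaml

# Author the workflow as plain Python data; emit YAML as the build artifact.
workflow = {
    "name": "CI",
    "on": ["push"],
    "jobs": {
        "build": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "./ci/build.sh"},
            ],
        },
    },
}

with open(".github/workflows/ci.yml", "w") as f:
    yaml.dump(workflow, f, sort_keys=False)
```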
Absolutely agreed. Well said and I'll be stealing this explanation going forward. Hell, just local running with simplicity and ability to test is a massive win of #2, aside from just not dealing with complex YAML.
It can be any scripting language. Python or TypeScript via Deno are good choices because they have batteries-included, cross-platform standard libs and are trivial to set up.
Python is actually preinstalled on GitHub CI runners.
Exactly, I showed here how we just write plain shell scripts. It gives you "PHP-like productivity", iterating 50 times a minute. Not one iteration every 5 minutes or 50 minutes.
I appreciate this perspective, however, after spending 6mo on a project that went (2) all the way, never again. CI/CD SHOULD NOT be using the same scripts you build with locally. Now, we have a commit that every dev must apply to the makefile to build locally, and if you accidentally push it, CI/CD will blow up (requiring an interactive rebase before every push). However, you can’t build locally without that commit.
I won’t go into the details on why it’s this way (build chain madness). It’s stupid and necessary.
This comment is hard to address without understanding the details of your project, but I will at least say that it doesn't mirror my experience.
Generally, I would use the same tools (e.g. ./gradlew build or docker build) to build stuff locally as on CI, and config params are typically enough to distinguish what needs to be different.
My CI scripts still tend to be more complicated than I'd like (due to things like caching, artifacts, code insights, triggers, etc.), but at least the main build logic is extracted.
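e.g. the same entry point with one flag flipped (the property name here is made up):

```bash
./gradlew build                 # local
./gradlew build -PisCI=true     # CI passes one extra project property
```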