There are two types of GitHub Actions workflows you can build.
1) Program with GitHub Actions. Google "how can I send an email with github actions?" and then plug in some marketplace tool to do it. Your workflows grow to 500-1,000 lines, sprout conditionals and all sorts of other nonsense, and the YAML becomes disgusting and hard to understand. GitHub Actions becomes a nightmare and you've invited vendor lock-in.
2) Configure with GitHub Actions. Always ask yourself "can I push this YAML complexity into a script?" and do it if you can. Send an email? Yes, that can go in a script. Your workflow ends up being about 50-60 lines as a result (see the sketch below) and very rarely needs to change once it's set up. GitHub Actions is suddenly fine, and you rarely have to do that stupid push-debug-commit loop, because you can debug the script locally.
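For concreteness, here's a minimal sketch of what (2) looks like; the script names are made up, the point is that the YAML stays a thin shim:

```yaml
# .github/workflows/ci.yml -- the workflow is just a task runner
name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # All the real logic lives in scripts you can run and debug locally:
      - run: ./ci/build.sh
      - run: ./ci/test.sh
      # "Send an email" is just another script, not a marketplace action:
      - run: ./ci/notify.sh
        if: failure()
```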
Every time I join a new team I tell them that (1) is the way to madness and (2) is the sensible approach. They always tepidly agree with me, and yet about half of the time they still do (1).
The thing is, the lack of debugging tools from Microsoft is really not much of a problem if you do (2), vendor lock-in is lower if you do (2), and debugging is easier if you do (2). But still, nobody does (2).
This is a great perspective, and one I agree with -- many of the woes associated with GitHub Actions can be eliminated by treating it just as a task substrate, and not trying to program in YAML.
At the same time, I've found that it often isn't sufficient to push everything into a proper programming language: I do sometimes (even frequently) need to use vendor-specific functionality in GHA, mark dependencies between jobs, invoke REST APIs that are already well abstracted as actions, etc. Re-implementing those things in a programming language of my choice is possible, but doesn't break the vendor dependency and is (IME) still brittle.
Essentially: the vendor lock-in value proposition for GHA is very, very strong. Convincing people that they should take option (2) means making a stronger value proposition, which is pretty hard!
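To be fair to (2), the genuinely GHA-specific surface can stay tiny. A sketch of job dependencies, which do have to live in the YAML (job and script names illustrative):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./ci/build.sh
  deploy:
    needs: build        # the inter-job dependency is GHA-specific; it can't move into a script
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./ci/deploy.sh
```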
You're right, it's not necessarily a good idea to be anal about this rule. If an action is simple to use and already built, I use it; I won't necessarily try to reimplement, say, the upload-artifact step in code.
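For instance, a mix like this seems reasonable (a sketch; paths are illustrative):

```yaml
steps:
  - uses: actions/checkout@v4
  - run: ./ci/build.sh                  # the logic stays in a script...
  - uses: actions/upload-artifact@v4    # ...but a trivial prebuilt action is fine
    with:
      name: dist
      path: dist/
```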
Another thing I noticed: if you do (1), sophisticated features like build caching and parallelization often become completely impractical, whereas if you default to (2) you can probably get them working with only a moderate amount of commit-push-debug.
Option (2) also makes it easier for developers to run their builds locally, so you're essentially using the same build chain for local debugging as you do for your Test/Staging/Prod environments, instead of maintaining two different build processes.
It's not just true for GHA, but for any build server really: The build server should be a script runner that adds history, artifact management, and permissions/auditing, but should delegate the actual build process to the repository it's building.
Good perspective. Unfortunately (1) is unavoidable when you're trying to automate GH itself (role assignments, tagging, etc.). But at this point, I would rather handle a lot of that manually than deal with GHA's awful debug loop.
FWIW, there's nektos/act, which aims to replicate GHA behavior locally, but I haven't tried it yet.
> Unfortunately (1) is unavoidable when you're trying to automate GH itself (role assignments, tagging, etc.)
Can't you just use the GitHub API for that? The script would be triggered by the YAML, but all the logic lives inside the script.
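E.g. tagging is one REST call from a script. A sketch, assuming the workflow exports GITHUB_TOKEN as an env var (OWNER/REPO are placeholders; GITHUB_SHA is set by the runner):

```bash
# Create a tag via the GitHub REST API, no marketplace action needed.
curl -sf -X POST \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/OWNER/REPO/git/refs" \
  -d "{\"ref\": \"refs/tags/v1.2.3\", \"sha\": \"$GITHUB_SHA\"}"
```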
But `act` is cool; I've used it for local debugging. Thing is, its output is impossibly verbose, and they don't aim to support everything an action does (which is fine if you stick to (2)).
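For reference, a typical `act` session looks roughly like this (flags from the versions I've used; check `act --help`):

```bash
act -l          # list the jobs act can see in .github/workflows/
act push        # run the workflow for a push event
act -j build    # run only the "build" job
```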
Yeah, I've done quite a bit of GitHub scripting via Octokit and it's pretty simple. Using GHA's built-in functionality might turn a five-line script into a one-liner, but I think being able to run the script directly is well worth the tradeoff.
The main thing that you can't decouple from GHA is pushing and pulling intermediate artifacts, which for some build pipelines is going to be a pretty big chunk of the logic.
How DO you debug your actions? I spend so long in the commit-action-debug-change loop it's absurd. I agree with your point re: (2) wholeheartedly though; it makes debugging scripts so much easier too. CI should be runnable locally, and GitHub Actions, while supported by some tooling, still isn't very easy to work with that way.
We may be splitting hairs given what this thread is going on about, but I strongly advocate for `--force-with-lease` as a sane default versus `-f` so that one does not blow away unexpectedly newer commits to the branch
The devil's in the details, etc, etc, but I think it's a _much_ more sane default, even for single-user setups/branches because accidents can happen and git DGAF
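i.e.:

```bash
git push --force-with-lease   # refuses the push if the remote branch moved since your last fetch
# optional alias so the safe version is the easy one to type:
git config --global alias.pushf 'push --force-with-lease'
```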
Act works pretty well to debug actions locally. It isn't perfect, but I find it handles about 90% of the write-test-repeat loop and therefore saves my teammates from dozens of tiny test PRs.
I may have misread this, but you know you can push to a branch and then run the action against it? That would reduce PRs if you're opening them just to check the action on master. You have to add a workflow_dispatch trigger to the workflow: https://docs.github.com/en/actions/using-workflows/manually-...
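i.e. the trigger block just needs something like (sketch):

```yaml
on:
  push:
  workflow_dispatch:   # adds a "Run workflow" button; you pick the branch in the UI
```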
Yeah, most of the time that is a good way to test. There are some specific actions that aren't easily tested outside of their regular spot, though. Mostly deployment-related pieces, due to the way our infrastructure is set up.
The main reason I aim for (2) is that I want to be able to drive my build locally if and when GitHub is down, and I want to be able to migrate away easily if I ever need to.
I think of it like this:
- I write scripts (as portable as possible) to build/test/sign/deploy/etc.
- They should always work locally.
- GitHub is for automating the setup of the environments where I can run those scripts, and then actually running them.
Totally get what you're saying. I once switched our workflow to trigger on PRs to make testing easier. Now, I'm all about using scripts — they're just simpler to test and fix.
I recommend making these scripts cross-platform for flexibility. Use matrix: and env: to handle it (see the sketch after this comment). Go for Perl, JavaScript, or Python over OS shells, and put file tasks in scripts to dodge path issues.
I've tried boxing these scripts into steps, but unless they're super generic for everyone, it doesn't seem worth it.
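Roughly what I mean, as a sketch (the env var and script name are made up):

```yaml
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    env:
      BUILD_MODE: release            # hypothetical: whatever your script reads
    steps:
      - uses: actions/checkout@v4
      - run: python ci/build.py      # one cross-platform script, every OS
```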
They don't seem to grasp how bad their setup is, and consequently are willing to put up with awful programming conditions. Even punch cards were better, as those people had the advantage of working with a real programming language with defined behaviour. "When exactly is this string interpolation step executed? In the anchor, or when referenced?" (Well, it depends.) No, it's black-box tinkering
(you might as well be prompt engineering)
The C in IaC is supposed to stand for code. Well, if you're supposed to code something, you need to:
- be able to assert correctness before you commit,
- be able to step through the code
If the setup they give you doesn't even have these minimal requirements you're going to be in trouble regardless of how brilliant an engineer you are.
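You can at least get partway there locally with linters. A sketch of what I'd reach for (actionlint is a third-party checker for workflow files; shellcheck covers the scripts):

```bash
shellcheck ci/*.sh                    # catch script bugs before committing
actionlint .github/workflows/*.yml    # catch workflow YAML mistakes statically
```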
I agree overall, but you oversimplify the issue a bit.
> can I push this YAML complexity into a script?
- what language is the script written in?
- will developers use the same language for all those scripts?
- does it need dependencies?
- where are we going to host scripts used by multiple github actions?
- if we end up putting those scripts in repositories, how do we update the actions once we release a new version of the scripts?
- how do you track those versions?
- how much does it cost to write a separate script and maintain it versus locking us in with an external github action?
These are just the first questions that pop into my mind, but there are more. And some answers may not be that difficult, yet it's still something to think about.
And I agree with the core idea (move logic outside pipeline configuration), but I can understand the tepid reaction you may get. It's not free, and you compromise on some things.
I think they framed it accurately and you are instead overcomplicating. Language for scripts is a decision that virtually every team ends up making regardless. The other questions are basically all irrelevant, since the scripts and actions are both stored in the repo, and therefore released and versioned together.
I think the point about maintenance cost is valid, but the thesis of the comment that you are responding to is that the prebuilt actions are a complexity trap.
I think you are still envisioning a fundamentally incorrect approach. Build scripts for a project are part of that project, not some external thing. The scripts are stored in the repository, and pulled from the branch being built. Dependencies for your build scripts aren't any different from any other build-time dependencies for your project.
I have a few open source projects that have lasted for 10+ years, and I can’t agree more with approach #2.
Ideally you want your scripting to handle all of the weird gotchas of different versions of host OSes, etc. Granted, my work is cross-platform, so the problem is compounded.
So far I’ve found relying on extensive custom tooling has allowed me to handle transitions from local, to Travis, to AppVeyor, to CircleCI and now also GitHub Actions.
You really want your CI config to specify the host platform and possibly set some env vars. Then it should invoke a single CI wrapper script. Ideally this can also be run locally.
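The wrapper script can default its inputs so the same entry point works locally too; a sketch (variable and script names are illustrative):

```bash
#!/usr/bin/env bash
# ci.sh -- single entry point, runnable from CI or from a dev machine.
set -euo pipefail

# CI sets these env vars; the defaults make bare local runs work as well.
: "${BUILD_MODE:=debug}"
: "${TARGET_PLATFORM:=$(uname -s)}"

./scripts/build.sh "$BUILD_MODE"
./scripts/test.sh
```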
There’s a curve. Stringy, declarative DSLs have high utility when used in linear, unconditional, stateless programming contexts.
Adding state?
Adding conditionals?
Adding (more than a couple) procedure calls?
These concepts perform poorly without common programming tools: testing (via compilation or development runtime), static analysis, intellisense, etc etc
Imagine the curve:
X axis is (vaguely) LinesOfYaml (lines of dsl, really)
Y axis is tool selection. Positive region of axis is “use a DSL”, lower region is “use a GeneralPurposeProgrammingLanguage”
The line starts at the origin, has a SMALL positive bump, then plummets downwards near-vertically.
Gets it right? Tools like ocurrent (contrasted against GH actions) [1], cdk (contrasted against TF yaml) [2]
Gets it wrong? Well, see the parent post. This made me so crazy at work (where seemingly everyone has been drinking the YAML DSL Kool-Aid) that I built a local product simulator and YAML generator for their systems, because "coding" against the product was so untenable.
Your advice is sane and I can tell it speaks from experience. Unfortunately, now that GitHub Actions is being exposed through Visual Studio, I fear we are going to see an explosion of (1), just because the process will be more disconnected from GitHub itself (no documentation or GitHub UI visible while working within Visual Studio).
I try to do (2), but I still run into annoyances. Like, I'll write a script to do some part of my release process. But then I start a new project and realize I need that script, so I copy it into the new repo. Then I fix a bug in that script, or add some new functionality, and I need to go and update the script in the other repo too.
Maybe this means I should encapsulate this into an action and check it in somewhere else. But I don't really feel like it; an action is a lot of overhead for a 20-line bash script. Not to mention that it erases the freedom from lock-in that the plain script gives me.
I guess I could check the script into a separate utility repo, and pull it into my other repos via git submodules? That's probably the least-bad solution. I'd still have to update the submodule refs when I make changes, but that's better than copy-pasting the scripts everywhere.
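The submodule flow is mostly tolerable; a sketch with a hypothetical ci-scripts repo:

```bash
# Share the scripts via a submodule instead of copy-pasting:
git submodule add https://github.com/you/ci-scripts ci-scripts
git commit -m "Add shared CI scripts"

# ...after fixing a bug in ci-scripts, each consuming repo bumps its pin:
git submodule update --remote ci-scripts
git commit -am "Bump ci-scripts"
```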
I agree, but of course all CI vendors build all their documentation, tutorials, and "best practices" 100% on the first option, for lock-in and to get you to use more of their ecosystem, like expensive caching and parallel runners. Many GitHub actions and CircleCI orbs could be replaced by a few lines of shell script.
Independent tutorials unfortunately fall into the same bucket, since they start from the official documentation to follow the so-called best practices, or just to get things working; and also, I would say, because shell scripts seem hackier to many people (unfairly so).
That's true for all CI services: do as little as possible in YAML, mostly just use it to start your own scripts. For the scripts, use something like Python or Deno to cover Linux, Mac, and Windows environments with the same code.
When GitHub actions came out, I felt bad about myself because I had no desire to learn their new programming language of breaking everything down into multiple small GitHub actions.
I think you explained quite well what I couldn't put my finger on last time:
Building every simple workflow out of a pile of 3rd party apps creates a lot of unnecessary complexity.
Since then, I have used GitHub actions for a few projects, but mostly stayed away from re-using and combining actions (except for the obvious use cases of "check out this branch").
YAML is perfect for simple scenarios, but users produce really complex use cases with it.
Is it possible to write a Python package that, based on a YAML specification, produces a Python API? The user would code in Python, and YAML would be the output.
I was working on a YAML syntax for creating UIs. I converted it to a Python API and I'm happy. For example, dynamic widgets were hard in YAML; in Python they are straightforward.
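A minimal sketch of the Python-in/YAML-out direction, using the PyYAML package (the workflow structure here is illustrative):

```python
import yaml

# Author the workflow as plain Python data; emit YAML as the build artifact.
workflow = {
    "name": "CI",
    "on": ["push"],
    "jobs": {
        "build": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"run": "./ci/build.sh"},
            ],
        },
    },
}

with open(".github/workflows/ci.yml", "w") as f:
    yaml.dump(workflow, f, sort_keys=False)
```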
Absolutely agreed. Well said and I'll be stealing this explanation going forward. Hell, just local running with simplicity and ability to test is a massive win of #2, aside from just not dealing with complex YAML.
It can be any scripting language. Python or TypeScript via Deno are good choices because they have batteries-included, cross-platform standard libs and are trivial to set up.
Python is actually preinstalled on GitHub CI runners.
Exactly, I showed here how we just write plain shell scripts. It gives you "PHP-like productivity", iterating 50 times a minute. Not one iteration every 5 minutes or 50 minutes.
I appreciate this perspective, however, after spending 6mo on a project that went (2) all the way, never again. CI/CD SHOULD NOT be using the same scripts you build with locally. Now, we have a commit that every dev must apply to the makefile to build locally, and if you accidentally push it, CI/CD will blow up (requiring an interactive rebase before every push). However, you can’t build locally without that commit.
I won’t go into the details on why it’s this way (build chain madness). It’s stupid and necessary.
This comment is hard to address without understanding the details of your project, but I will at least say that it doesn't mirror my experience.
Generally, I would use the same tools (e.g. ./gradlew build or docker build) to build stuff locally as on CI, and config params are typically enough to distinguish what needs to be different.
My CI scripts still tend to be more complicated than I'd like (due to things like caching, artifacts, code insights, triggers, etc.), but at least the main build logic is extracted.
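e.g. the same entry point with one flag flipped (the property name here is made up):

```bash
./gradlew build                 # local
./gradlew build -PisCI=true     # CI passes one extra project property
```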