> terraform-lsp is supposed to provide autocomplete, but it mostly doesn’t, in my experience.
I find the Terraform plugin for JetBrains IDEs to work a hundred times better than the VS Code one. I can really recommend it; it does wonders for autocompletion, peek definition, snippet insertion, and linting TF/HCL files.
The author of this blog post has not mentioned Terragrunt [0], though I think it is worth mentioning. It is a nice tool, especially if you work on bigger projects with multiple modules and per-environment variables.
Also, a tip from a person who works on a team using Terraform: use brew/apt repositories to keep the binaries up to date, or at least on the same version as your teammates. I remember at least a few situations where a patch update of the Terraform binary was crucial to making some issues disappear.
Psst... I am also using Terraform to bootstrap Pop!_OS with my dotfiles, and it is surprisingly good at mimicking a declarative and atomic NixOS-style configuration :)
[0]: https://terragrunt.gruntwork.io/
> Also, a tip from a person who works on a team using Terraform: use brew/apt repositories to keep the binaries up to date, or at least on the same version as your teammates.
Use tfenv instead. That way you can have a .terraform-version file in your repo and tfenv will refuse to run without the right version.
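For anyone who hasn't set this up: a minimal sketch, with an illustrative version number. tfenv picks the version from a plain .terraform-version file, and pinning required_version in the configuration gives teammates who don't use tfenv the same guard rail.

```
# .terraform-version (read by tfenv; version is just an example)
0.13.5
```

```hcl
# versions.tf - Terraform itself also refuses to run with a mismatched binary
terraform {
  required_version = "~> 0.13.5"
}
```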
For distributed work on environments, ideally you want something like a GitLab pipeline that does the terraform plan for you automatically, with a manual approve button of sorts. That way you have a consistent TF version defined in .gitlab-ci.yml, and thanks to centralized logging it's easier to find out when and what changed in the envs.
I implemented this on top of Bitbucket Pipelines with a 'manual' pipeline step. The pipeline runs the plan step automatically and outputs it to the console. Pipelines then shows a button next to it called 'deploy', which is only clicked after manually verifying the plan results.
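For reference, a hedged sketch of the GitLab variant described above (image tag, stage names and timings are placeholders, and the backend/credentials setup is omitted):

```yaml
stages: [plan, apply]

plan:
  stage: plan
  image: hashicorp/terraform:0.13.5   # one pinned version for the whole team
  script:
    - terraform init -input=false
    - terraform plan -input=false -out=plan.tfplan
  artifacts:
    paths: [plan.tfplan]

apply:
  stage: apply
  image: hashicorp/terraform:0.13.5
  script:
    - terraform init -input=false
    - terraform apply -input=false plan.tfplan
  dependencies: [plan]
  when: manual   # the "approve button": apply only runs when someone clicks it
```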
> or at least on the same version as your teammates. I remember at least a few situations where a patch update of the Terraform binary was crucial to making some issues disappear.
I find it a little evil that Terraform is used so widely in production yet still hangs on to the "0.x.y" semantic version meaning: we break the API whenever we want. It should be more stable for a tool so widely used. C'mon HashiCorp, be fair to us and release a 1.0!
The arguments around the language and syntax inconsistencies here really echo my experience.
I can't help but think it just wasn't designed, but patched together by different people over time, with no over-arching vision. There's nothing in Terraform that couldn't just be YAML with some semantics (like Ansible which I find much more approachable), and while a custom language can absolutely do better than that, it really needs to justify itself, and I'm not sure Terraform does yet.
There are some hints of why it should be a custom language, but I honestly think it needs an overhaul focusing on consistency and clarity. These things really make a difference; I'd say it's one of the main reasons Go took off in such a big way, for example.
I have been using more and more of https://www.pulumi.com/ for my deployments. It’s nice that the deployment code is written in js/python/go/.net so I can make it as customizable as I want.
I'm seriously considering switching to Pulumi, mostly because it seems a lot more flexible. Not to mention that developers are already familiar with the language, rather than having to learn all the quirks and inconsistencies of HCL on top of using the tool itself.
From my ~1 year of experience with Terraform, the simplicity of HCL leads to some pretty complicated config. In particular the absence of a true if, the inability to define custom functions, and the inability to use any kind of condition or iteration on modules.
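For anyone who hasn't run into this: the usual workaround is a ternary driving count, which works per resource but (until 0.13 added count/for_each on modules) did not extend to modules. A rough sketch, with hypothetical names:

```hcl
variable "create_bastion" {
  type    = bool
  default = false
}

variable "bastion_ami" {
  type    = string
  default = "ami-00000000"   # placeholder
}

# "if" emulated via count: zero or one copies of the resource
resource "aws_instance" "bastion" {
  count         = var.create_bastion ? 1 : 0
  ami           = var.bastion_ami
  instance_type = "t3.micro"
}

# every reference then needs indexing plus a guard for the empty case
output "bastion_ip" {
  value = var.create_bastion ? aws_instance.bastion[0].public_ip : null
}
```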
I keep seeing this pop up in search results, and I'm curious to try it in anger.
I've recently switched roles, and have gone from sharing a pretty large terraform deployment on AWS/GCP to managing infrastructure written with cloudformation and troposphere.
(Troposphere is a Python library that generates CloudFormation templates. Nice to have a real programming language, but still pretty constrained by the fact that the output is "just" CloudFormation.)
Of the two I have had less pain with terraform, but both have caused their days of pain in different ways.
If you've been using TF for any amount of time though, 0.13 is a breaking change (because semver... oh wait...). The migration from 0.11 to 0.12 was a PITA, and if you're using any third-party modules or resources, they still don't have consistent support for 0.13. Using 0.12 currently gives you the most consistent experience.
This is not bashing TF, it's just reality. I wish I had time to update everything to 0.13. The third party provider support alone makes it valuable.
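For context, the bulk of a 0.13 upgrade is declaring explicit provider source addresses; third-party providers won't resolve from the registry without something like this (versions are illustrative):

```hcl
terraform {
  required_version = ">= 0.13"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
    # third-party providers now need an explicit source address too
    github = {
      source  = "integrations/github"
      version = "~> 3.0"
    }
  }
}
```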
I read you - I'm currently still going through the 0.13 upgrade. The pain, however, is considerably smaller than the 0.12 upgrade, and I've already observed two 0.12 bugs disappearing on 0.13. The bugfix backporting seems sluggish and it doesn't encourage me at all to keep using 0.12, despite it being "the most consistent experience".
If your only comparison is "I used to work with Heroku", Terraform might not seem great. I would argue this is a Blub example, in that this person's lens does not contain enough context or experience to be able to assess its value.
Terraform's value becomes clear in scenarios like moving from CloudFormation to Terraform, or when trying to integrate a second cloud's resources into your infrastructure workflow. Without experience in anything other than Heroku, of course a tool designed to do many, many things is going to seem complex and at times frustrating.
> Iteration times are way longer than with even mobile apps. Like, “you’re liable to task-switch while waiting to see plan output” longer.
I'm over here laughing in my "over an hour of CloudFormation only to have a run fail, and then a rollback fail, and have to contact AWS support" life.
> I'm over here laughing in my "over an hour of CloudFormation only to have a run fail, and then a rollback fail, and have to contact AWS support" life.
Oh, I see I am not alone in having this experience with the AWS CloudFormation (CF) update flow. It took me about 6-8 months to feel "good enough" with CF to write templates from memory (accompanied by a documentation page). The funny thing about CF is that everybody I know had a working flow based on searching GitHub/Gist for an example, then edit, refine, deploy, push fixes, deploy, rinse and repeat.
I noticed the blog post mentioned Pulumi. To be honest, I have not used it yet, but I have tried AWS CDK [0]. I suppose it is AWS's go-to solution for everybody who wants to keep using CloudFormation for managing deployment state while also writing templates programmatically in a language of their choice (like TypeScript or Python). It is an interesting solution that I suggest investigating if you haven't already. It supports "importing" CF templates as-is, so they can be incrementally translated to CDK.
I came to Terraform optimistically from CloudFormation, thinking it would fix many of the latter's warts, and it sort of has, except it's introduced as many of its own. I'm still undecided about which set of problems I prefer, but in general I'm disappointed with Terraform. Some particular things that bother me:

I get the distinct impression that it's trying to badly reinvent a programming language (with "locals" replacing variables, "variables" replacing parameters, "modules" replacing functions, "for_each" replacing loops or comprehensions, and so on).

Additionally, I find myself wanting rollback support like CloudFormation has. It's unsettling that TF makes it easy to get into a bad state.

Further still (and maybe this is just my organization's use of Terraform), the convention seems to be to split the whole architecture up into lots of root modules, but the links between resources in these modules are basically string identifiers (e.g., ARNs in the AWS world), which will likely change if a resource gets deleted and recreated, or if AWS changes its naming conventions, and so on. Similarly, people seem to build these identifiers from strings instead of referencing them directly from resource attributes (I've seen this practice advertised in some of the AWS provider docs, IIRC), which is bad for all of the same reasons that pointer arithmetic is bad.

I do like that custom providers aren't full-on Lambdas that I have to deploy, unlike in the CloudFormation world, but mostly I've been disappointed.
I wonder if Pulumi or AWS CDK is the solution I’ve been searching for, or if I should just stick to generating CloudFormation from YAML.
> I'm over here laughing in my "over an hour of CloudFormation only to have a run fail, and then a rollback fail, and have to contact AWS support" life.
Heh, same. Any of the cloud-provided orchestration tools (Cloudformation, Openstack Heat etc.) are great only for the most basic tasks; using them to provision complex infrastructure is just begging for a world of hurt.
That being said, I think Terraform could do better. I use Terraform a lot, and yet I agree with the author's complaints that the syntax can be super confusing, it is not documented very well, and the providers have their own idiosyncrasies. The last one isn't strictly an issue with Terraform, it's with the provider implementations. But if Terraform aims to be the Swiss army knife of infrastructure provisioning, I think the criticism that it's hard to standardize even within the same provider is fair.
> Heh, same. Any of the cloud-provided orchestration tools (Cloudformation, Openstack Heat etc.) are great only for the most basic tasks; using them to provision complex infrastructure is just begging for a world of hurt.
I disagree. I can't speak to OpenStack Heat (and I have no idea what you're referring to by 'cloud-provided orchestration tools' beyond these two specifically), but my own experience using CloudFormation to provision complex infrastructure is that it is in fact great for all but the most basic tasks (where any orchestration tool would just add unnecessary overhead).
I imagine other clouds have similar tools (or perhaps they have converged towards more general ones like Ansible or Terraform).
As for your disagreement: you're free to have your own opinion on this. But my personal experience has been that eventually converging infrastructure provisioning systems like CF have complex failure modes that make it hard to modularize and scale them up.
With Terraform, the cloud provider is reduced to a dumb API, and most of the issues you see can be resolved client-side. Whereas when dealing with issues in CloudFormation, it's not something you can figure out yourself; you have to hope that the error is something that's clearly displayed to you and/or open a support ticket with AWS.
> my own experience using CloudFormation to provision complex infrastructure
If we're talking anecdotes, my experience is vastly different. It's difficult to assess what's going to be a replacement (other than browsing the docs); replacements cause a chain of other replacements; a leaf replacement fails due to some syntax in an SSM doc that wasn't checked at the very beginning, wasting 45 minutes, so everything gets rolled back, but you used Retain policies, so those ASGs are now unmanaged and still live, so you need to delete them manually.
Building something complex means breaking stacks down into multiple pieces, and you can only use URLs. Which can only point to S3. Which means you have to pre-upload your sources there. For which you have to build the tooling, because AWS provides nothing. So not only do you have to build your infrastructure: you need to build the infrastructure to build and develop your infrastructure.
You want to know why a nested stack is going to replace that ASG you thought was safe? Well, you can always dive into the nested stack changeset... aaand nothing there. You can't. Maybe in the parent stack JSON.
Complexity without loops? Good luck. Or lots of copy pastes, I guess. And the Conditions are just rudimentary and clunky.
The Resources/Events UI is just meh, with no sensible sizing for the otherwise huge columns (big names, big ARNs). It's impossible to get the sorting of incoming events right; every refresh reshuffles the rows.
And CFN L1 support is hit-or-miss: 50% of the time it's simply not useful because of the complexity of the infrastructure; we just get the problem echoed back to us. I'm lucky we have Enterprise and can escalate. There are issues we wouldn't have solved for weeks otherwise.
I very much like the fact that we have a state management tool. But calling it great is an overstatement.
> Difficult to assess what's going to be a replacement (other than browsing the doc)
Do you know of any better solution than docs? Or better implementations of docs? I struggle to imagine how they could handle this better - but I've been using CF for so long that my imagination is limited.
For a start, I'd want to see, directly from the changeset window:
- the nested stack elements that are going to change, not only the parent ones
- the logical members/fields that prompt the change
All this info they have, and it should be trivial to show, which tells me they either don't use their own tooling (no dogfooding), or they have internal, better diagnostic tools not exposed to customers.
> create-change-set – Change sets for nested stacks is not enabled by default for the AWS CLI. To create a change set for the entire stack hierarchy, specify the --include-nested-stacks parameter. For more information, see To create a change set (AWS CLI).
> Any of the cloud-provided orchestration tools (Cloudformation, Openstack Heat etc.) are great only for the most basic tasks; using them to provision complex infrastructure is just begging for a world of hurt.
It's also fun when you have to standardize something that was either created by hand in the web UI (and has a bunch of hidden default values set) or was created by some other orchestration tool, and then you have to import/recreate it in Terraform. It's usually a PITA, but you'll learn a lot!
Yes! It's great to capture a bespoke configuration in code. You may already know this, but there is a tool called Terraformer that has limited support for doing just this: https://github.com/GoogleCloudPlatform/terraformer
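For anyone who hasn't done a manual import: the stock workflow is to write a stub resource first and then attach the live object to it, roughly like this (names are made up):

```hcl
# main.tf - stub for a bucket that already exists in the account
resource "aws_s3_bucket" "legacy" {
  bucket = "my-legacy-bucket"
}
```

```
terraform import aws_s3_bucket.legacy my-legacy-bucket
terraform plan   # the diff now reveals the hidden defaults you still need to codify
```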
> I'm over here laughing in my "over an hour of CloudFormation only to have a run fail, and then a rollback fail, and have to contact AWS support" life.
Your experience must be over four years out of date, because CloudFormation has supported continuing the rollback of a failed update since February 2016 [1].
Their experience is accurate. If you use CloudFormation for anything non-trivial, you'll be rudely corrected from thinking that this works, and then spend time migrating to Terraform so you never go through that again; plus you'll get support for most new AWS features faster. I spent 2019 and early 2020 helping people make that switch after they got burned by CF "impossible" situations.
Yes, it is true. However, I am hesitant to say AWS CloudFormation is a bulletproof tool that never needs help from the support side. If somebody is too impatient and manually removes a resource that is managed by CFN, they might end up in a deadlock where the only help is actually a support team member. Been there; don't repeat that error - please do not create (non-auto-detectable) drift changes on purpose (it can make the stack hard to modify/remove in the future).
> If somebody is too impatient and manually removes a resource that is managed by CFN, they might end up in a deadlock where the only help is actually a support team member.
This specific situation can be resolved by doing a 'continue update rollback' and skipping the already-deleted resources - see the troubleshooting guide [1] for more detailed info.
Needing to contact support for this issue in 2019 only means that the user didn't know how to use the 'continue update rollback' feature properly, a feature added back in 2016 specifically to support rollback-failed scenarios.
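For reference, the CLI form looks roughly like this (stack name and logical IDs are made up):

```
aws cloudformation continue-update-rollback \
  --stack-name my-stack \
  --resources-to-skip MyDeletedInstance AnotherDeadResource
```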
It's funny, then, how frequently the AWS support reps agree that the issues we find aren't ones that would have been resolvable on our own - each time we file a ticket, we follow up with our reps to ask for the appropriate training and support.
I understand that you like this tool, but you do not need to fight everyone suggesting it doesn't work as well as you want it to.
I’ve also run into these from time to time, at an old gig where we didn’t have a support contract, and thus we just had borked stacks here and there and prayed that we would never similarly bork prod. On the other hand, we worked hard to get our CloudFormation templates to a state that we could rebuild prod from scratch fairly easily if it came down to it.
Terraform died for me not because it didn't work well to set up one set of resources, but because when I went back to update things much later, a backwards-incompatible new version of TF totally broke my state file, and the third-party providers I was using had not caught up to the new version of TF yet. Short of starting all over again, I was dead in the water.
As much as I really liked how quickly I was originally able to get my infra up and running with TF, if I'm ever in need of similar functionality again, I'll find something else to use.
It is in v0, breaking changes are expected. If the new version is not compatible, wait until it is and then upgrade. It is also completely reasonable to not want to use it until it is stable.
> Terraform 0.14 release. That's our last chance to deprecate features or introduce breaking changes, before 1.0. Will we have 0.15? That's anyone's guess at this point. I honestly don't know.
Calling software that's been out for 6+ years, powering everyone's IaC from startups to multi-billion-dollar companies around the world, "initial development" to fit into SemVer is a stretch for me.
> When it doesn’t, it generally fails in a useful way, and then I can fix it and try again.
This is the "killer feature" for me when comparing Terraform and CloudFormation. I know they've been working on making CF better, but the way Terraform handles errors (and allows you to pick up from where you left off, instead of waiting around for stuff to roll back) is a lot better suited to tight experimentation and feedback loops.
I think this feature lends itself well to tight feedback loops, as you said. However, after I crossed the painful divide from CloudFormation novice to... well, whatever is after that, CloudFormation rollbacks became a friend instead of an enemy. The ability to return to the last known good checkpoint is a powerful feature that is extremely useful in production systems and the pipelines that deploy them. In those scenarios, you're not experimenting and learning, but enhancing and evolving existing infrastructure.
Some secrets (like IAM access keys) can be encrypted with a public PGP key.
Some other secrets (like RDS instance master passwords) can't be encrypted, but I like to use a trick where a local provisioner runs after the instance gets created, updates the just-created DB to set the password to a random value, and prints it to stdout so you can save it in your secrets management tool of choice. The value saved in the Terraform state is then no longer valid.
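A minimal sketch of that trick, assuming a Postgres instance and psql available wherever Terraform runs; resource names and the reset command are illustrative, not a drop-in:

```hcl
resource "aws_db_instance" "main" {
  identifier        = "app-db"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "app"
  password          = "temporary-password"   # ends up in state, but is invalidated below

  # runs once, right after creation: swap the throwaway password for a random
  # one that never touches the state file, and print it a single time to stdout
  provisioner "local-exec" {
    command = <<-EOT
      NEWPW=$(openssl rand -base64 24)
      PGPASSWORD='temporary-password' psql -h ${self.address} -U app postgres \
        -c "ALTER USER app WITH PASSWORD '$NEWPW';"
      echo "new master password: $NEWPW"
    EOT
  }
}
```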
Some other other secrets, like elasticache auth tokens - there's no solution, they'll always be available as plaintext in the statefile.
Overall agreed that Terraform needs to do better here.
On RDS, my typical pattern is to use Secrets Manager to auto-rotate the master password immediately. AWS has sample Lambdas for the various databases.
I have a Lambda that runs every day to cycle the RDS master password. I create the password using the random provider and save it to a Secrets Manager secret - the first time the environment is created, the secret is in plaintext in the state, but it will not be valid the next time the Lambda runs (less than 24 hours later).
You can do the same with ElastiCache auth tokens, and ensure your application reads the token from a value in secrets manager.
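Roughly, the Terraform side of that pattern looks like the following; the rotation Lambda itself and its scheduling are out of scope here, and all names are illustrative:

```hcl
# generated by Terraform, so it sits in the state only until the first rotation
resource "random_password" "rds_master" {
  length  = 32
  special = false
}

resource "aws_secretsmanager_secret" "rds_master" {
  name = "app/rds/master"   # both the application and the rotation Lambda read this
}

resource "aws_secretsmanager_secret_version" "rds_master" {
  secret_id     = aws_secretsmanager_secret.rds_master.id
  secret_string = random_password.rds_master.result
}

resource "aws_db_instance" "app" {
  identifier        = "app-db"
  engine            = "mysql"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "app"
  password          = random_password.rds_master.result
}
```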
No, once you've logged in with MySQL, changing the password doesn't close the connection.
For rotating application passwords we use the same technique but we update the usernames, i.e. app_1@'%' becomes app_2@'%', and then rotates back to app_1@'%' to prevent issues with unsynced config files.
I am using Terraform daily, and it's excruciating at times. First, HCL is very limited - HashiCorp didn't even use it for Sentinel! I won't switch to TypeScript either - I wish they had adopted an embedded language like Starlark, as in this POC [0].
There are many inconsistencies; refactoring is very, very painful. While `terraform import` and `terraform state mv` work, it's all manual, and the former doesn't work with their SaaS, Terraform Cloud, as imports, unlike runs, are always local, but sensitive variables are only available in the cloud. As pointed out, secrets are in plain text - the GCP state even had my personal access token, which it's not supposed to, as it's not something that has to be shared with the team.
The good thing is that the velocity is greatly improving - we got v0.13 this year, and v0.14 is almost out - many features of v0.15 are already done, so that pace compared to v0.12 is new!
My biggest issue is how much copypasta you need just for reusing modules! There's no clean way to do dependency injection, and you have to replicate code all over the place, as you can't "export" resources or data sources - you can only export primitive values, and then you need a data source which uses that primary key of the resource.
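Concretely, the pattern being complained about looks something like this (module layout and names are hypothetical): the module can only hand back an ID, and every consumer has to rehydrate the object with its own data source:

```hcl
# modules/network/main.tf
resource "aws_vpc" "this" {
  cidr_block = "10.0.0.0/16"
}

# modules/network/outputs.tf - only primitive values cross the module boundary
output "vpc_id" {
  value = aws_vpc.this.id
}

# root configuration - look the object up all over again
module "network" {
  source = "./modules/network"
}

data "aws_vpc" "shared" {
  id = module.network.vpc_id
}

resource "aws_subnet" "app" {
  vpc_id     = data.aws_vpc.shared.id
  cidr_block = cidrsubnet(data.aws_vpc.shared.cidr_block, 8, 1)
}
```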
Terraform could turn into a huge, missed opportunity. I hope they realized the potential and that they need to execute fast.
Also, it's very maddening that their prioritization is based on the number of thumbs-ups!
Last but not least - many key providers are run by people who don't follow standards and who work part-time on key projects, instead of HashiCorp making sure there's a great DX with major vendors. One such example is the GitHub provider. A recent huge fiasco was a backward-incompatible minor release which also didn't update the docs. After three weeks, the maintainer was still considering unreleasing it, when people had already found workarounds and reverse-engineered the issues and the missing documentation. The same provider still does not support organization-level secrets even though this feature has been available for more than 6 months!
"Resource and module object values: An entire resource or module can now be treated as an object value within expressions, including passing them through input variables and output values to other modules, using an attribute-less reference syntax, like `aws_instance.foo`"
There are definitely some oddities in HCL. Regarding the dynamic blocks, it gets weirder. If you have a dynamic block inside of a resource that is using for_each, you can still reference the 'each' value alongside the value that the 'dynamic' block is looking at.
The original for_each is referencing its object as 'each', while the for_each inside the dynamic is referencing the iteration as 'foo'. Why not just name the for_each inside the dynamic block as something else?
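A sketch of what's being described, using security group rules as a made-up example; the resource-level for_each keeps the fixed name "each", while the dynamic block's iterator defaults to the block name (the iterator argument exists precisely to rename it):

```hcl
variable "groups" {
  type = map(object({
    ingress_ports = list(number)
  }))
}

resource "aws_security_group" "this" {
  for_each = var.groups                    # resource-level iteration: "each"
  name     = each.key

  dynamic "ingress" {
    for_each = each.value.ingress_ports    # block-level iteration: "ingress" by default
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
}
```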
With that said, I absolutely love Terraform, and it is IMO the most powerful tool I've had the pleasure to work with. I also am a fan of DSLs if they are thoughtfully done. I think with some improvements, like some of the ones the author suggests, HCL could be a great language to work with.
For me the biggest disappointment is that it won't import an existing AWS stack into config files. The idea of declarative infrastructure seems amazing, but this implementation hasn't overwhelmed me.
A very good summary; this resonates well with my views. Terraform survived because devops folks mostly come from a non-programming background and have low expectations of a DSL. When developers look at TF, there are obvious things that glare.
I would add to the list the fact that for_each cannot be used with pre-computed values.
So if you have a module/resource that outputs a list of resources, you can't just plug it into another resource and use for_each to iterate over it.
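The usual failure mode, sketched with hypothetical names: keys that aren't known until apply make Terraform refuse the plan, so the common workaround is to key the for_each on statically known values and keep the computed values on the right-hand side only:

```hcl
# Fails at plan time when the IDs are not yet known:
#   for_each = toset(module.network.subnet_ids)
# ("The for_each value depends on resource attributes that cannot be
#  determined until apply ...")

# Workaround: iterate over statically known keys instead.
variable "az_suffixes" {
  type    = set(string)
  default = ["a", "b", "c"]
}

resource "aws_eip" "nat" {
  for_each = var.az_suffixes
  vpc      = true
  tags = {
    Name = "nat-${each.key}"
  }
}
```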
I also think that using multiple state files is the only way to keep the config from being too fragile. Which is another point in which Terraform provides very little help and why Terragrunt is a popular combo.
Terraform certainly has its fair share of quirks, but that's no different from any other language.
However, I think it's fair to say that the infrastructure as code ecosystem is still much younger and therefore less mature. And there are various things that are standard in application development toolchains that don't exist for IaC yet.
Take my focus, use-case-specific frameworks, for example. If I'm building a web application, I don't write my own request routing or authentication. But for Terraform, the majority of teams have to start from scratch, even for common use-cases. Yes, there are reusable modules, but they're comparable to libraries, and integrating various modules is still a lot of effort.
If my use-case is building a Jamstack website, doing so using Gatsby gets me started faster, gives me a modern developer experience and means I can re-use community tested and maintained components to reduce the bespoke code I have to write and maintain.
I'm trying to do the exact same thing for IaC with Kubestack.
Kubestack is an open-source Terraform framework for managed Kubernetes. If your use-case is provisioning and maintaining EKS, AKS or GKE using Terraform, Kubestack may be worth trying. In my obviously creator-biased opinion.
It helps you with the typical framework-like workflow to get started faster: scaffold a repository with one command, then bring up a local development environment with the next.
Yes, the long feedback cycles of IaC can be annoying. That's why I'm trying to improve the developer experience by providing an auto updating local development environment.
For all modules (EKS, AKS, GKE) I maintain as part of the Kubestack framework, I also maintain a local variant. These accept the same input variables, but instead of provisioning cloud resources, they provision "mock" clusters locally using Docker containers as the cluster nodes.
The kbst CLI watches for changes in the repository, and then runs Terraform locally and dynamically replaces the module source of the real cluster module with the local variant. Here's a video showing that in action: https://youtu.be/_VtakP6AdCs
Similarly, I provide a Docker image for each framework release to provide a tested combination of versions of Terraform, its providers and the cloud CLIs (aws, gcloud, az). The images are used to bootstrap, for CI/CD runs, and for the occasionally required manual tasks (state mv, etc.) or disaster recovery.
Just to name two examples from the discussion here where the developer experience of Terraform lags behind the equivalent application development tooling.
Many people underestimate IaC, probably because of all the 5 minute Terraform tutorials and demos out there. But what these miss is that the real work only starts when you have to get your automation ready for day-2 operations.
This is where I'm trying to provide a better developer experience through reusable, use-case-specific modules, inheritance-based environment configuration to avoid drift, and integration into a convenient but reliable GitOps workflow for teams, from local development all the way to production.
Kubestack's code will have been open source for two years this December. But I only got around to writing documentation after leaving the DevOps consulting job that inspired the framework.
Anyone interested can give it a try: https://www.kubestack.com/ The site has links to the Slack channel (#kubestack on the Kubernetes Slack) and also the source on Github.