19 October, 2021
Multi-env AWS with Terraform and Terragrunt in one hour
tl;dr This guide is for people who want to get started with AWS and Terraform, who want to do it properly and securely, and who want to get building FAST.
In 2020 I left my permanent job and started Merge Mamba (now shut down). I made a lot of mistakes on the way (blog post to follow) and one of the big ones was spending too long tweaking the tech stack, and not enough time actually figuring out if I had a viable product.
My investment in tech has left me with a solid base from which I can iterate new ideas really fast. Much of that is down to the time I spent getting Terraform to play nicely with a multi-account AWS setup. Over the past few years I've seen a lot of different Terraform setups in everything from startups to large financial institutions and if you:
- Are an AWS user
- Are a Terraform user (you should be)
- Need to build FAST but securely, and in a way that'll make future hires/devops happy
...the setup I'm describing here will be useful. It's nothing groundbreaking and there may be better ways to do it. All I've done is read the docs (pretty much everything in here is lifted from other tutorials) and figure out the slightly less obvious stuff but hopefully it'll get you going with a multi-env AWS infrastructure stack in less than an hour.
Prerequisites
- Install Terraform
- Install Terragrunt
- Have an AWS account for each environment and one root account.
- Have admin access to AWS (for bootstrapping)
The Example Repo
Clone or fork the example repo. The example repo is split into two sections: all .tf files are in the modules directory, and the supporting Terragrunt code is in the env directory. Placeholders are of the form <YOUR_.*> and indicate where you need to fill in values. There are a few READMEs lying around in the repo that may also be useful.
Terraform structure
Terragrunt is a wrapper around Terraform that adds some functionality to make it easier to write modular, parsimonious Terraform. See the docs for more but at a high level:
All common Terraform configuration (e.g. providers, common vars) is defined in env/terragrunt.hcl, which ensures commonality between your environments.
Each subdirectory of env/ represents a different AWS account, and each contains an env.yaml file with per-env configuration that is made available to your Terraform. Edit env/<env_name>/env.yaml to include the correct AWS account id and a name for your environment. Ensure these correspond to the AWS accounts you created above.
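For a feel of what goes in that file, here's a minimal sketch of env/dev/env.yaml. The exact keys depend on the example repo; these are the two values this guide relies on:

```yaml
# env/dev/env.yaml - per-environment values made available to Terraform (sketch).
aws_account_id: "123456789012"   # the dev account's 12-digit account id
env: "dev"                       # environment name, used e.g. for profile selection
```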
Each subdirectory in /modules is a Terraform stack that can be applied to any environment that has a corresponding directory in /env/<env_name>/. You'll notice that /env/root/iam_users exists but /env/prod/iam_users does not, as we want to apply this module only in the root account. In this way you can restrict resources to certain environments at a module level (as well as at a resource level, by switching on the environment name that's made available to your Terraform code via the inputs {} block in the top-level terragrunt.hcl file). You can introduce dependencies between these modules if you need to pass data between them (see the Terragrunt docs for more).
env \ <- each env has a directory here
- root
- prod
- dev
modules \ <- each tf stack has a directory in here
- users
- roles
- product_a
- product_b
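To show how the two halves connect, a per-stack terragrunt.hcl is typically just a couple of lines that pull in the shared config and point at the module. This is a sketch, not the repo's exact contents:

```hcl
# env/dev/product_a/terragrunt.hcl - minimal per-stack wiring (sketch).
# Inherits providers, remote state and inputs from env/terragrunt.hcl.
include {
  path = find_in_parent_folders()
}

terraform {
  source = "../../../modules//product_a"
}
```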
Environments
This guide assumes that you're using dedicated subdomains for each non-prod environment. For example:
- (*.)henrycourse.com refers to my production environment
- (*.)<env>.henrycourse.com refers to my environment <env>
You don't have to do it like this, but it is an assumption of this guide.
AWS Accounts And Users
In this guide I'm going to assume that you have two environments: prod and dev. With this setup, it's trivial to add new environments (there is zero repetition of Terraform code; just add a new subdirectory in the env dir) and override per-environment configuration as required.
Each environment has its own AWS account that contains all of its resources and roles required to access them. This strong separation is great for ops and security.
All AWS users live in the root account and access environments by assuming roles in those environments. This makes securing access to environments very straightforward and means that user management is reasonably centralised.
Create AWS accounts for each environment that you need.
AWS auth
I'm assuming for this guide that your default AWS profile points to a user or role in the root account with admin permissions in all the accounts you own. Once you've created users and roles, you can stop using the root user and switch to the profile-per-env approach this guide assumes, e.g. in ~/.aws/config:
[default]
region = eu-west-2
output = json
[profile prod]
role_arn = arn:aws:iam::...
source_profile = default
[profile dev]
role_arn = arn:aws:iam::...
source_profile = default
If that's not the case you'll need to edit the provider block in env/terragrunt.hcl to change the profile selector:
provider "aws" {
region = "<YOUR_HOME_REGION>"
profile = "${local.env_vars.env == "root" ? "default" : local.env_vars.env}"
}
provider "aws" {
alias = "useast"
region = "us-east-1"
profile = "${local.env_vars.env == "root" ? "default" : local.env_vars.env}"
}
You don't need to have this sorted now if you're running Terraform as an AWS root/admin user, but once you've created the roles in the next steps you should switch over to it.
Creating a state store
Step one involves creating an s3 bucket to hold remote Terraform state. I've chosen to have a single bucket in the root account that holds the state for all envs under separate keys. With some minor tweaking you could use one bucket per environment.
To create the state bucket:
- In the env/root/tf_state_s3 directory, run terragrunt apply to create the state bucket using local state, check the diff and type yes to confirm.
- Uncomment the terraform {} block in modules/tf_state_s3/tf_state_s3.tf.
- Uncomment the remote_state {} block in env/terragrunt.hcl and add the name of your state bucket.
- In the env/root/tf_state_s3 directory, run terragrunt apply again and type yes when prompted to copy the local state to the s3 bucket.
The s3 bucket is now ready to go as a remote state store.
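For reference, the remote_state block you uncomment looks roughly like this. The bucket name and region are placeholders, and the exact block lives in the example repo; path_relative_to_include() is what gives each env/stack its own state key:

```hcl
# env/terragrunt.hcl - remote state config (sketch; values are placeholders).
remote_state {
  backend = "s3"
  config = {
    bucket  = "<YOUR_STATE_BUCKET>"
    key     = "${path_relative_to_include()}/terraform.tfstate"
    region  = "<YOUR_HOME_REGION>"
    encrypt = true
  }
}
```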
Creating users
Run terragrunt apply in the env/root/iam_users directory to create a user called terraform in the root account. Add as many users as you need, or replace this step as required with your chosen auth provider.
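The core of that module is just an IAM user plus an exported ARN, something like the sketch below. The repo's version may add access keys, groups or MFA policy on top:

```hcl
# modules/iam_users/iam_users.tf - the terraform user and its ARN (sketch).
resource "aws_iam_user" "terraform" {
  name = "terraform"
}

# Exported so other stacks (e.g. iam_roles) can reference this user.
output "terraform_user_arn" {
  value = aws_iam_user.terraform.arn
}
```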
Creating roles
We'll now create a role called terraform in the prod and dev accounts. Run terragrunt apply in the env/<env>/iam_roles directories to do this. N.B. This creates a role with admin privileges in those accounts that is assumable by the terraform user. Notice the code in /env/prod/iam_roles/terragrunt.hcl that adds a dependency on the user stack in the root account:
dependency "iam_users" {
config_path = "../../root/iam_users"
}
inputs = {
terraform_user_arn = dependency.iam_users.outputs.terraform_user_arn
}
This is an example of Terragrunt's module syntax and lets you keep separation between resource stacks in different accounts while explicitly structuring the dependencies between them.
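On the receiving side, the iam_roles module uses the passed-in ARN in the role's trust policy. A sketch of the idea, with names I've assumed rather than taken from the repo:

```hcl
# modules/iam_roles/iam_roles.tf - an admin role assumable by the
# root account's terraform user (sketch; names are assumptions).
variable "terraform_user_arn" {
  type = string
}

resource "aws_iam_role" "terraform" {
  name = "terraform"

  # Trust policy: only the terraform user may assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { AWS = var.terraform_user_arn }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "admin" {
  role       = aws_iam_role.terraform.name
  policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}
```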
...and that's it
At this point you have a multi-account Terraform setup with users defined in the root account, the ability to control access to environment accounts, and the ability to create environments as required. Your Terraform can be split into self-contained stacks, and you can easily control what gets deployed where and the dependencies between them (including across account boundaries).
Extra reading
DNS and certificates took me a little while to figure out but if you need them, you might find this section useful. Code is provided in the repo.
Route53
There are no zones defined in the root account; each environment defines its own zones. This is nice because once again it keeps DNS record management separated per-environment and allows for easy access control. One thing that complicates this slightly is that you need to provide the subdomain nameservers in a new NS record in the top-level zone (which in this setup lives in the prod environment). See here for more info.
To do that with Terragrunt you can output the subdomain nameservers and provide them as input to the prod/route53
stack where you can create the required NS records.
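As a sketch, the NS delegation in the prod zone might look like this. The resource and variable names here are my own, not the repo's:

```hcl
# modules/route53/ns_delegation.tf - delegate dev.henrycourse.com by
# publishing the dev zone's nameservers in the prod (top-level) zone (sketch).
resource "aws_route53_record" "dev_ns" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "dev.henrycourse.com"
  type    = "NS"
  ttl     = 172800
  records = var.dev_zone_name_servers   # passed in from the dev env's zone output
}
```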
Letsencrypt certificates
If you're a Let's Encrypt user (you should be) you might find this certificate setup useful. If you're using Route53 DNS, Terraform can automatically update and store LE certificates in ACM. This makes it very easy to generate, roll and deploy certificates to anywhere in AWS.
The ACME Terraform provider needs access to Route53 to do this, and if you want to use your certs in CloudFront, you'll need to create a duplicate cert in the us-east-1 region. To see this all in action, check out the Terraform in modules/certs/certs.tf. It'll create certs in your home region and us-east-1, and has the right lifecycle configuration to allow rolling of certificates while they're in use. This is still a bit of a pain though; check the README in the directory for more.
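The shape of the ACM import (not the full ACME issuance flow) is roughly this; the us-east-1 copy just repeats the resource with the aliased provider. A sketch under my own naming, not the repo's exact code:

```hcl
# Import a Let's Encrypt cert into ACM in the home region and us-east-1 (sketch).
resource "aws_acm_certificate" "home" {
  private_key       = acme_certificate.cert.private_key_pem
  certificate_body  = acme_certificate.cert.certificate_pem
  certificate_chain = acme_certificate.cert.issuer_pem

  lifecycle {
    create_before_destroy = true   # lets in-use certs roll without downtime
  }
}

resource "aws_acm_certificate" "useast" {
  provider          = aws.useast   # the us-east-1 aliased provider from earlier
  private_key       = acme_certificate.cert.private_key_pem
  certificate_body  = acme_certificate.cert.certificate_pem
  certificate_chain = acme_certificate.cert.issuer_pem

  lifecycle {
    create_before_destroy = true
  }
}
```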