NumTide

DevOps consulting by Developers.

generate-terraform-provider-shim

Dragan Milic, Jonas Chevalier

2020-08-25

A stopgap solution for using community Terraform providers

Two weeks ago, HashiCorp announced the release of Terraform 0.13.0. This release automates the installation of third-party providers, which until now was a major pain point when using them. Unfortunately, this solution puts the burden of running a service that implements the registry protocol onto the third-party provider authors. With time, we expect that most community-provided plugins will be available through such registries, but at the moment, most of them are not.

As a stopgap solution, we have implemented a so-called Provider Shim generator.

Provider Shims

TL;DR: A Provider Shim is a Bash script that is placed in the repository and that downloads, caches and executes the real Terraform provider when invoked. Because it is so small, it is well suited to being checked in alongside the Terraform code in source code repositories.

Background

Before we can explain what Provider Shim does, it is crucial to understand what a Terraform provider is and how Terraform interacts with the providers.

What is a Terraform provider?

The basic building blocks of Terraform are Resources and Data sources. A Resource is something that is managed (created, updated, destroyed) by Terraform, for example an EC2 instance or a Storage Bucket in Google Cloud. A Data source, on the other hand, is something that is not managed by Terraform per se (for example, a GCP VM that has been created manually), but that Terraform can query for information (such as the public IP address of said VM).

Terraform itself does not know how to interact with the resources. It only manages information about them, and all operations (create, read, update, destroy) are delegated to so-called providers.

Providers are executable files that Terraform starts and terminates as needed. Once a provider is started, Terraform communicates with it through a Unix socket using the gRPC protocol. Providers do not store any state of their own; Terraform passes them the state whenever an operation (such as ReadResource or PlanResourceChange) needs to be performed.

How does Terraform get its providers?

When terraform init is run, Terraform downloads all the required modules and parses their HCL (.tf) files. Among other things, these files contain provider requirements, describing which versions of providers (and, since Terraform 0.13.x, which registry locations) are required.

For each such requirement, Terraform will perform the following steps:

  • Try to find the plugin on the local machine, in the following directories:
    • Current directory: . (used mainly for plugin development)
    • Same directory where terraform binary is located
    • terraform.d/plugins
    • .terraform.d/plugins
    • ~/.terraform.d/plugins
  • If the plugin binary is not available in any of those locations, try downloading the plugin from the registry, storing it in .terraform.d/plugins
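The local part of this lookup can be sketched as a small shell function. This is only an illustration of the order listed above, not Terraform's actual implementation: find_provider_binary and its parameters are hypothetical names, the directory of the terraform binary is passed in explicitly, and any platform-specific subdirectories are ignored.

```bash
#!/usr/bin/env bash
# Sketch: look for a provider binary in the directories listed above, in order.
# find_provider_binary and its arguments are hypothetical names for illustration.

find_provider_binary() {
    local provider_name="$1"      # e.g. "linuxbox"
    local module_root="$2"        # directory of the root Terraform module
    local terraform_bin_dir="$3"  # directory containing the terraform binary
    local dir candidate

    for dir in \
        "${module_root}" \
        "${terraform_bin_dir}" \
        "${module_root}/terraform.d/plugins" \
        "${module_root}/.terraform.d/plugins" \
        "${HOME}/.terraform.d/plugins"
    do
        # Assumes binaries are named terraform-provider-<name>_<version>.
        for candidate in "${dir}/terraform-provider-${provider_name}_"*; do
            if [ -f "${candidate}" ] && [ -x "${candidate}" ]; then
                echo "${candidate}"
                return 0
            fi
        done
    done

    # Nothing found locally: this is the point where the registry would be tried.
    return 1
}
```

For example, find_provider_binary linuxbox . "$(dirname "$(command -v terraform)")" prints the first matching binary, or returns a non-zero status if none of the directories contains one.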

Once terraform init has succeeded, terraform plan/apply/destroy search for the needed plugins in the same directories that terraform init searches.

What happens if a provider is not available through a registry?

Running terraform init is a really convenient way to install all the providers needed by your Terraform project, provided that they are available through the provider registry protocol. If that is not the case, things get ugly.

When your Terraform project depends on a community provider that can't be downloaded with terraform init, you have to obtain the provider binary yourself and put it in the correct path for Terraform to find it.

This is a very tedious and error-prone manual process that has to be repeated for every provider and every location where Terraform is executed.

To make this easier, one can use one of the paths relative to the root Terraform module (such as terraform.d/plugins or .terraform.d/plugins) and check the binaries in to version control together with the Terraform code.

This leads to repeatable builds with the manual task being done only once, but it also means that the repository contains the binaries of all the providers, each of them megabytes in size. Working with such repositories can be very cumbersome.
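To make the manual approach concrete, here is a minimal sketch of unpacking a provider archive that was downloaded by hand into a plugin directory next to the Terraform code. The helper name install_provider_archive is hypothetical, and the sketch assumes a .tar.gz archive whose binary follows the terraform-provider-* naming.

```bash
#!/usr/bin/env bash
# Sketch of the manual installation described above.
# install_provider_archive is a hypothetical helper name.

install_provider_archive() {
    local archive="$1"       # path to an already downloaded .tar.gz archive
    local module_root="$2"   # directory of the root Terraform module
    local plugin_dir="${module_root}/terraform.d/plugins"

    mkdir -p "${plugin_dir}"
    tar -xzf "${archive}" -C "${plugin_dir}"
    # The provider binary has to be executable so that it can be started.
    chmod 755 "${plugin_dir}"/terraform-provider-*
}
```

Everything that ends up in terraform.d/plugins would then be committed, which is exactly where the multi-megabyte binaries enter the repository.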

Provider Shim saves the day

Instead of checking in the binary of a provider, we propose checking in a so-called Provider Shim.

The shim is a small Bash script that checks whether a copy of the provider binary is on the local disk. If not, it downloads the binary from GitHub. Once the binary is available, the shim starts it.

Such a shim is small (less than 2 kilobytes) and, being a Bash script, easy to review and debug.

An example of such a Shim looks like this:

#!/usr/bin/env bash
#
# Generated by generate-terraform-provider-shim: https://github.com/numtide/generate-terraform-provider-shim
#

set -e -o pipefail

plugin_url="https://github.com/numtide/terraform-provider-linuxbox/releases/download/v0.2.2/terraform-provider-linuxbox_v0.2.2_linux_amd64.tar.gz"
plugin_unpack_dir="${XDG_CACHE_HOME:-$HOME/.cache}/terraform-providers/linuxbox_v0.2.2"
plugin_binary_name="terraform-provider-linuxbox_v0.2.2"
plugin_binary_path="${plugin_unpack_dir}/${plugin_binary_name}"
plugin_binary_sha1="7232dbb6760d34e844ce731226b9eec67c5bb276"

if [[ ! -d "${plugin_unpack_dir}" ]]; then
    mkdir -p "${plugin_unpack_dir}"
fi

if [[ -f "${plugin_binary_path}" ]]; then
    current_sha=$(git hash-object "${plugin_binary_path}")
    if [[ $current_sha != "${plugin_binary_sha1}" ]]; then
        rm "${plugin_binary_path}"
    fi
fi

if [[ ! -f "${plugin_binary_path}" ]]; then
    curl -sL "${plugin_url}" | tar xzvfC - "${plugin_unpack_dir}"
    chmod 755 "${plugin_binary_path}"
fi

current_sha=$(git hash-object "${plugin_binary_path}")
if [[ $current_sha != "${plugin_binary_sha1}" ]]; then
    echo "plugin binary sha does not match ${current_sha} != ${plugin_binary_sha1}" >&2
    exit 1
fi

exec "${plugin_binary_path}" "$@"

What does the Provider Shim do?

Provider Shim performs the following operations:

  • Check if the binary of the provider is available in the cache directory (~/.cache/terraform-providers, unless XDG_CACHE_HOME points elsewhere).
  • If there is no binary available, use curl to fetch an archive of the binary from the release in GitHub.
  • When fetched, extract the binary from the archive.
  • Check the integrity of the binary against a known SHA1 of the binary. This step will detect if someone has replaced the binary of the provider in the GitHub release or on the local disk.
  • If the SHA1 matches, exec the provider, passing it the same arguments that the shim received. This replaces the bash process with the process of the provider binary.
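One detail of the integrity check is worth spelling out: the shim computes the checksum with git hash-object, which hashes the file content prefixed with a short blob header. The value in plugin_binary_sha1 will therefore not match what sha1sum prints for the same file. A minimal sketch of the same computation without git, assuming sha1sum is installed (git_blob_sha1 is a hypothetical helper, not part of the generated shim):

```bash
#!/usr/bin/env bash
# Sketch: reproduce the checksum of `git hash-object <file>` without git.
# git_blob_sha1 is a hypothetical helper name.

git_blob_sha1() {
    local file="$1"
    local size
    # Arithmetic expansion strips the padding some wc implementations emit.
    size=$(( $(wc -c < "${file}") ))
    # Git hashes the header "blob <size>" plus a NUL byte, then the raw content.
    { printf 'blob %d\0' "${size}"; cat "${file}"; } | sha1sum | cut -d ' ' -f 1
}
```

This can be handy for checking a downloaded binary against the value embedded in a shim on a machine where git is not available.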

What happens when Terraform finds the Provider Shim at the right place?

Once Terraform executes the Provider Shim instead of the provider, the shim will (if needed) download the provider binary and start it.

All of this is transparent to Terraform, as if the provider binary had been executed directly.

The only noticeable difference is the wait while curl fetches the provider over the network. This happens only once; after that, the provider binary is cached on the local disk and won't be downloaded again.
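If a cached binary ever needs to be fetched again, removing its cache directory is enough, since the shim recreates the directory and downloads the archive when the binary is missing. A small sketch, with the path layout copied from the plugin_unpack_dir variable of the example shim; provider_cache_dir and clear_provider_cache are hypothetical helper names:

```bash
#!/usr/bin/env bash
# Sketch: locate and clear the cache directory used by the example shim.
# Both function names are hypothetical.

provider_cache_dir() {
    local name_and_version="$1"   # e.g. "linuxbox_v0.2.2"
    echo "${XDG_CACHE_HOME:-$HOME/.cache}/terraform-providers/${name_and_version}"
}

clear_provider_cache() {
    # Refuse to run without an argument so the whole cache is never removed by accident.
    if [ -z "$1" ]; then
        echo "usage: clear_provider_cache <name>_<version>" >&2
        return 1
    fi
    rm -rf -- "$(provider_cache_dir "$1")"
}
```

After clear_provider_cache linuxbox_v0.2.2, the next Terraform run that invokes the shim will incur the download delay once more.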

Generating Provider Shims

Generating such Provider Shims manually is a repetitive task that can be easily automated. For this purpose, we have implemented a command line utility to generate such shims.

To generate shims for your Terraform project, execute generate-terraform-provider-shim <provider path> in the directory of your root Terraform module. The required <provider path> argument is the <owner>/<project> GitHub path of the provider's project.

By default, generate-terraform-provider-shim finds the latest release of the provider and generates a shim for it in the terraform.d/ directory for each architecture supported by the provider.

If a specific version is required, the --version=<semver matcher> argument can be provided.

The generated Provider Shims can (and should) be checked in together with the Terraform code.

Since version 0.2.0, the shim generator will generate shims in proper paths for both Terraform 0.12.x and 0.13.x, making it a great tool for a smooth transition to Terraform 0.13.x.
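For projects that depend on several community providers, the generator calls can be wrapped in a small loop. The following is a sketch under a few assumptions: generate_shims, the GENERATOR override and the "<owner>/<project>@<version>" spec format are hypothetical conventions of this wrapper, placing --version before the provider path is an assumption about argument order, and an exact version such as 0.2.2 is assumed to be an acceptable semver matcher.

```bash
#!/usr/bin/env bash
# Sketch: generate shims for several providers from the root Terraform module.
# generate_shims, GENERATOR and the "@<version>" suffix are hypothetical.

generate_shims() {
    # GENERATOR can be overridden, e.g. for a dry run with "echo".
    local generator="${GENERATOR:-generate-terraform-provider-shim}"
    local spec provider version

    for spec in "$@"; do
        provider="${spec%%@*}"
        case "${spec}" in
            *@*)
                version="${spec#*@}"
                "${generator}" --version="${version}" "${provider}" || return 1
                ;;
            *)
                "${generator}" "${provider}" || return 1
                ;;
        esac
    done
}
```

Running generate_shims numtide/terraform-provider-linuxbox@0.2.2 in the root module directory would then invoke the generator once per listed provider, after which the generated files can be committed.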

Limitations

Like every hack, Provider Shims come with a set of limitations. We are aware of the following constraints for using Provider Shims:

Dependencies

Due to their nature, Provider Shims have a number of dependencies that have to be installed on the system in order for them to work. Fortunately, most of those dependencies are available on many Unix-like systems.

Here is the list of dependencies:

  • bash: Provider Shims are bash scripts relying on bash internal commands.
  • curl: used for fetching archives of the community provider.
  • unzip or gzip/tar: depending on the archive type used in the provider release, either unzip or tar/gzip are required.
  • git: we took the unorthodox approach of using a git internal command (git hash-object) to calculate the SHA1 of the provider binary. The rationale: most of the time Terraform projects are stored in git repositories, hence git will be available.
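Whether a given machine satisfies these dependencies can be checked up front with a few lines of shell. This is a minimal sketch; missing_commands is a hypothetical helper name and is not part of the generator or the shims.

```bash
#!/usr/bin/env bash
# Sketch: report which of the given commands cannot be found on PATH.
# missing_commands is a hypothetical helper name.

missing_commands() {
    local cmd status=0
    for cmd in "$@"; do
        if ! command -v "${cmd}" >/dev/null 2>&1; then
            echo "${cmd}"
            status=1
        fi
    done
    # Non-zero exit status if at least one command is missing.
    return "${status}"
}
```

Calling missing_commands bash curl git tar gzip unzip prints the name of every dependency that is not installed and returns a non-zero status if any is missing.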

Only providers with binaries attached to the GitHub releases are supported

We are relying on the developers of the provider to create releases with attached compiled binaries of the providers for different architectures. If that is not the case, a Provider Shim cannot be generated.

Only .tar.gz and .zip archives are supported

There is no standard way of packaging providers. Most of the time they are packaged in a Zip or gzipped Tar archive, and those are the formats we support.
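Supporting the two formats comes down to choosing the extraction tool from the archive name. The sketch below illustrates the idea; unpack_archive is a hypothetical helper and not the generator's actual code.

```bash
#!/usr/bin/env bash
# Sketch: choose an extraction method based on the archive file name.
# unpack_archive is a hypothetical helper name.

unpack_archive() {
    local archive="$1"
    local target_dir="$2"

    mkdir -p "${target_dir}"
    case "${archive}" in
        *.tar.gz)
            tar -xzf "${archive}" -C "${target_dir}"
            ;;
        *.zip)
            unzip -q -o "${archive}" -d "${target_dir}"
            ;;
        *)
            echo "unsupported archive type: ${archive}" >&2
            return 1
            ;;
    esac
}
```

Any other archive type falls through to the error branch, which mirrors the limitation described in this section.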

Windows is not supported

We do not have access to a Windows machine and have never run Terraform in a Windows environment, hence the generated shims will definitely not work under Windows.

Can't be executed in Terraform Cloud

Since the VMs used in Terraform Cloud lack numerous dependencies (most notably curl), Provider Shims cannot be used in the remote execution mode of Terraform Cloud.

Conclusion

Provider Shims can be very useful in the Terraform 0.12.x world, and also during the transition to Terraform 0.13.x, until community providers become available through their own registries.