Managing Ansible through IAP

To begin with

So IAP (Identity-Aware Proxy) is a great tool to help protect access to resources in your GCP environment. And Ansible is a great low-barrier-to-entry tool for managing machines in that environment. But the two systems don’t really play well together. Work has been done to make the two tools talk better, but what existed didn’t work quite right when I tried it. So here are a few updates…

How it all works

The inventory file

Ansible can create inventories from GCP API calls. So we first create an inventory file requesting host information from all our projects:

plugin: gcp_compute
projects:
  - project1
  - project2
auth_kind: application
keyed_groups:
  - key: labels
    prefix: label
  - key: project
    prefix: project
compose:
  ansible_host: name

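The gcp_compute inventory plugin documentation has more detail on specific sections. The big things that matter here are the keyed_groups and compose steps. keyed_groups decides which groups exist when you go to select hosts (here, one group per label and one per project). compose sets ansible_host to the instance name, which the initial connection depends on, so probably don’t mess with it.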
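To get a feel for the group names this produces, here is a rough sketch. It assumes the plugin's default "_" separator and a made-up label env=prod; keyed_group_name is just a helper for illustration, not something Ansible provides.

```bash
#!/bin/bash
# Illustration only: join a keyed_groups prefix with the key/value parts
# using the default "_" separator.
keyed_group_name() {
    local name=$1
    shift
    local part
    for part in "$@"; do
        name="${name}_${part}"
    done
    printf '%s\n' "${name}"
}

# A host labelled env=prod (hypothetical label) lands in this group:
keyed_group_name label env prod      # prints label_env_prod
# A host in project1 lands in this group:
keyed_group_name project project1    # prints project_project1
```

With names like these, something along the lines of ansible label_env_prod -m ping, or --limit project_project1 on a playbook run, should pick out the hosts you want.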

The ansible.cfg file

To get this working nicely for users (so they don’t have to execute weird ansible commands) we need to set some defaults in a local ansible.cfg file:

[inventory]
enable_plugins = gcp_compute

[defaults]
inventory = inventory/gcp.yml

[ssh_connection]
# Enabling pipelining reduces the number of SSH operations required
# to execute a module on the remote server.
# This can result in a significant performance improvement
# when enabled.
pipelining = True
ssh_executable = misc/gcp-wrapper.sh
ssh_args =
# Tell ansible to use SCP for file transfers when connection is set to SSH
scp_if_ssh = True
scp_executable = misc/gcp-scp-wrapper.sh
scp_args =

It’s very important that we leave the ssh_args = and scp_args = lines in, so that Ansible doesn’t add its own default arguments and break our IAP request (which doesn’t support any default SSH CLI switches like -O).

The scripts

To actually leverage IAP, we need to call it correctly. That’s where our SSH/SCP override scripts come in:

gcp-wrapper.sh

#!/bin/bash
# This is a wrapper script that lets us use GCP's IAP option to connect
# to our servers.
# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then pass through only the flags meant for
# gcloud (see the filter below).
# shellcheck SC2124 warns about assigning an array to a string, which is why
# we use * instead of @ here.
host="${*: -2: 1}"
cmd="${*: -1: 1}"

# Unfortunately ansible has hardcoded ssh options, so we need to filter these out
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for s_arg in "${@: 1: $# -2}" ; do
    if [[ "${s_arg}" == --* ]] ; then
        opts+=("${s_arg}")
    fi
done

exec gcloud compute ssh "${opts[@]}" "${host}" --command "${cmd}"

gcp-scp-wrapper.sh

#!/bin/bash
# This is a wrapper script that lets us use GCP's IAP option to copy files
# to and from our servers.
# Ansible passes a large number of scp parameters along with the source as the
# second to last argument and the destination as the last. We will pop the last
# two arguments off of the list and then pass through only the flags meant for
# gcloud (see the filter below).
# shellcheck SC2124 warns about assigning an array to a string, which is why
# we use * instead of @ here.
src="${*: -2: 1}"
dest="${*: -1: 1}"

# Unfortunately ansible has hardcoded scp options, so we need to filter these out
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for s_arg in "${@: 1: $# -2}" ; do
    if [[ "${s_arg}" == --* ]] ; then
        opts+=("${s_arg}")
    fi
done

# Remove [] around our host, as gcloud scp doesn't understand this syntax.
# The remote side is the destination on upload and the source on fetch,
# so strip both (this also strips brackets from local paths).
src=$(echo "${src}" | tr -d "[]")
dest=$(echo "${dest}" | tr -d "[]")

exec gcloud compute scp "${opts[@]}" "${src}" "${dest}"

These two scripts will actually make the proper ssh/scp connections to your hosts. You can debug them by adding a set -x to the script and running ansible/ansible-playbook with -vvv.
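If you want to see what the filtering does without touching a real host, a dry-run version of the ssh wrapper logic can help. This is only a sketch: dry_run_wrapper is a name I made up, the host web-1, the zone and the uptime command are placeholders, and the real arguments Ansible passes are longer than this.

```bash
#!/bin/bash
# Dry-run sketch of the ssh wrapper: same argument handling, but it prints
# the gcloud command instead of exec'ing it.
dry_run_wrapper() {
    # Second to last argument is the host, last is the command.
    local host="${*: -2: 1}"
    local cmd="${*: -1: 1}"
    local -a opts=()
    local s_arg
    # Everything before the last two arguments is a candidate option;
    # keep only the ones that start with '--'.
    for s_arg in "${@: 1: $# - 2}"; do
        if [[ "${s_arg}" == --* ]]; then
            opts+=("${s_arg}")
        fi
    done
    # Display only: arguments are joined with spaces, not re-quoted.
    printf '%s\n' "gcloud compute ssh ${opts[*]} ${host} --command ${cmd}"
}

# Placeholder arguments in the shape Ansible would pass them.
dry_run_wrapper -o ControlMaster=auto \
    --tunnel-through-iap --zone=us-central1-a --project=project1 \
    web-1 uptime
# prints: gcloud compute ssh --tunnel-through-iap --zone=us-central1-a --project=project1 web-1 --command uptime
```

Note how both -o and its value ControlMaster=auto disappear, since neither starts with '--'.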

The group vars

The final step to making this all work the way we want is actually defining our full IAP commands for each host. Fortunately, we can do this dynamically with group variables.

group_vars/all.yml

ansible_ssh_common_args: "--tunnel-through-iap --zone={{ hostvars[inventory_hostname].zone }} --project={{ hostvars[inventory_hostname].project }}"
ansible_scp_extra_args: "--tunnel-through-iap --zone={{ hostvars[inventory_hostname].zone }} --project={{ hostvars[inventory_hostname].project }}"

ansible_become: yes
ansible_become_user: root
ansible_become_method: sudo

And there’s our last bit of magic. We define the zone and project for each host we connect to using the inventory data for the hosts. Nice.
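One detail worth spelling out: the flags above use the --zone=value form on purpose. As far as I can tell, Ansible splits these strings on whitespace before handing them to the wrapper, and the wrapper only keeps arguments that start with '--'. The sketch below shows what that means in practice; keep_gcloud_flags is a made-up helper that mirrors the wrapper's filter, and the zone is a placeholder.

```bash
#!/bin/bash
# Mirror of the wrapper filter: keep only arguments that start with '--'.
keep_gcloud_flags() {
    local -a kept=()
    local arg
    for arg in "$@"; do
        if [[ "${arg}" == --* ]]; then
            kept+=("${arg}")
        fi
    done
    printf '%s\n' "${kept[*]}"
}

# With '=', the value travels inside the flag and survives the filter.
keep_gcloud_flags --tunnel-through-iap --zone=us-central1-a
# prints: --tunnel-through-iap --zone=us-central1-a

# With a space, the value is its own argument and gets dropped.
keep_gcloud_flags --tunnel-through-iap --zone us-central1-a
# prints: --tunnel-through-iap --zone
```

So if you add more gcloud flags to these variables, sticking to the --flag=value form is probably the safe choice.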

Results

We end up with an Ansible setup that lets us connect through IAP to hosts just like we would using SSH:

#$ ansible-playbook -D common.yml --ask-vault-pass
Vault password:
/usr/lib/python3.10/site-packages/google/auth/_default.py:79: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/
  warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
/usr/lib/python3.10/site-packages/google/auth/_default.py:79: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/
  warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)

PLAY [install common] *******************************

TASK [Gathering Facts] ******************************************