Abhishek 58301c9eda Add containerd_config support to gke-nodepool (#3973)
* Add ephemeral_storage_local_ssd_config support to modules/gke-nodepool

Adds ephemeral_storage_local_ssd_count to node_config variable and the
corresponding dynamic ephemeral_storage_local_ssd_config block in the
node pool resource, enabling use of local SSDs as ephemeral storage.

* feat(gke-nodepool): add flex_start support to node_config

Add `flex_start` as an optional bool to the `node_config` variable type
and wire it through to the `google_container_node_pool` resource's
node_config block. This enables DWS (Dynamic Workload Scheduler)
flex-start mode for node pools, used for on-demand capacity access
without requiring ProvisioningRequest objects (e.g. spot TPU pools).

* feat(gke-nodepool): add flex_start support to node_config

Add `flex_start` as an optional bool to the `node_config` variable type
and wire it through to the `google_container_node_pool` resource's
node_config block. This enables DWS (Dynamic Workload Scheduler)
flex-start mode for node pools, which allows the Cluster Autoscaler to
request capacity on-demand without requiring ProvisioningRequest objects
(unlike queued_provisioning). Typical use case is spot TPU node pools.

* feat(gke-nodepool): add advanced_machine_features support to node_config

Add `advanced_machine_features` as an optional object to the `node_config`
variable type and wire it through to the `google_container_node_pool`
resource via a dynamic block. This allows callers to configure
`threads_per_core` (e.g. set to 1 to disable hyperthreading) and
`enable_nested_virtualization` for node pools that require fine-grained
CPU threading control or nested hypervisor support.

GKE auto-sets `advanced_machine_features` (threads_per_core=1) on
ct6e/TPU machine types; exposing this field also lets consumers add it to
ignore_changes in their own lifecycle blocks to avoid forced replacements.

* feat(gke-nodepool): add containerd_config support to node_config

Add `containerd_config` as an optional object to the `node_config` variable
and wire it through to the `google_container_node_pool` resource via a
dynamic block. This allows callers to configure private registry mirrors or
custom containerd registry hosts per node pool — useful for air-gapped
environments and internal registry proxies.

The `registry_hosts` list maps each upstream server to one or more mirror
hosts, with optional `capabilities`, `override_path`, and `dial_timeout`
fields (all defaulting to sensible values).

* refactor(gke-nodepool): use maps for containerd_config registry_hosts and hosts

Convert registry_hosts and hosts from lists to maps so that the registry
server and host URLs serve as stable keys, avoiding index-shifting issues
with for_each. Add default values for capabilities, override_path, and
dial_timeout. Update README example and test inventory accordingly.

* Remove default values from containerd_config hosts fields

Leave capabilities, override_path, and dial_timeout without defaults
so the provider/API picks them rather than the module imposing values.

* Refine containerd_config variable interface

- Simplify header to optional(map(list(string)))
- Flatten ca, client cert/key to strings with descriptive names
- Derive private_registry_access_config enabled from ca domain config list
- Simplify writable_cgroups to optional(bool)
- Flatten gcp_secret_manager_certificate_config to string
- Remove redundant defaults where try() handles null in main.tf
- Fix long lines in main.tf to stay within 79-char limit
- Update copyright year to 2026 in inventory files

* fix(gke-nodepool): run terraform fmt to fix attribute alignment in containerd_config

* docs(gke-nodepool): regenerate README with updated variable line numbers

* fix(gke-nodepool): use coalesce instead of try for null header map in for_each

* tests(gke-nodepool): update containerd-config inventory to match actual plan output

---------

Co-authored-by: Julio Castillo <jccb@google.com>
2026-05-27 10:00:26 +00:00
2026-04-18 10:07:14 +02:00
2026-04-24 20:45:45 +00:00
2026-05-25 12:27:30 +00:00
2025-10-26 11:56:41 +01:00
2024-05-28 10:53:13 +02:00
2025-10-24 13:11:17 +02:00
2025-12-04 15:15:35 +01:00
2025-05-21 10:25:45 +02:00
2026-05-25 12:27:30 +00:00
2025-12-07 10:43:25 +01:00
2026-05-25 12:27:30 +00:00
2026-05-25 12:27:30 +00:00
2026-04-18 10:07:14 +02:00
2026-04-18 10:07:14 +02:00
2019-05-03 17:58:36 -04:00
2026-05-14 08:03:35 +00:00
2023-12-22 16:23:24 +01:00

Cloud Foundation Fabric

Terraform Examples and Modules for Google Cloud

This repository provides end-to-end blueprints and a suite of Terraform modules for Google Cloud, which support different use cases:

  • organization-wide landing zone toolkit used to bootstrap real-world cloud foundations
  • a comprehensive source of lean modules that lend themselves well to changes

The whole repository is meant to be cloned as a single unit, and then forked into separate owned repositories to seed production usage, or used as-is and periodically updated as a complete toolkit for prototyping. You can read more on this approach in our contributing guide, and a comparison against similar toolkits here.

Organization toolkit (Fabric FAST)

Setting up a production-ready GCP organization is often a time-consuming process. Fabric FAST aims to speed up this process via two complementary goals. On the one hand, FAST provides a design of a GCP organization that includes the typical elements required by enterprise customers. Secondly, we provide a reference implementation of the FAST design using Terraform.

Modules

The suite of modules in this repository is designed for rapid composition and reuse, and to be reasonably simple and readable so that they can be forked and changed where the use of third-party code and sources is not allowed.

All modules share a similar interface where each module tries to stay close to the underlying provider resources, support IAM together with resource creation and modification, offer the option of creating multiple resources where it makes sense (eg not for projects), and be completely free of side-effects (eg no external commands).

The current list of modules supports most of the core foundational and networking components used to design end-to-end infrastructure, with more modules in active development for specialized compute, security, and data scenarios.

Currently available modules:

For more information and usage examples see each module's README file.

Description
No description provided
Readme 70 MiB
Languages
HCL 90.7%
Python 8.3%
Shell 0.6%
Smarty 0.3%
Dockerfile 0.1%