Add containerd_config support to gke-nodepool (#3973)
* Add ephemeral_storage_local_ssd_config support to modules/gke-nodepool Adds ephemeral_storage_local_ssd_count to node_config variable and the corresponding dynamic ephemeral_storage_local_ssd_config block in the node pool resource, enabling use of local SSDs as ephemeral storage. * feat(gke-nodepool): add flex_start support to node_config Add `flex_start` as an optional bool to the `node_config` variable type and wire it through to the `google_container_node_pool` resource's node_config block. This enables DWS (Dynamic Workload Scheduler) flex-start mode for node pools, used for on-demand capacity access without requiring ProvisioningRequest objects (e.g. spot TPU pools). * feat(gke-nodepool): add flex_start support to node_config Add `flex_start` as an optional bool to the `node_config` variable type and wire it through to the `google_container_node_pool` resource's node_config block. This enables DWS (Dynamic Workload Scheduler) flex-start mode for node pools, which allows the Cluster Autoscaler to request capacity on-demand without requiring ProvisioningRequest objects (unlike queued_provisioning). Typical use case is spot TPU node pools. * feat(gke-nodepool): add advanced_machine_features support to node_config Add `advanced_machine_features` as an optional object to the `node_config` variable type and wire it through to the `google_container_node_pool` resource via a dynamic block. This allows callers to configure `threads_per_core` (e.g. set to 1 to disable hyperthreading) and `enable_nested_virtualization` for node pools that require fine-grained CPU threading control or nested hypervisor support. GKE auto-sets `advanced_machine_features` (threads_per_core=1) on ct6e/TPU machine types; exposing this field also lets consumers add it to ignore_changes in their own lifecycle blocks to avoid forced replacements. * feat(gke-nodepool): add containerd_config support to node_config Add `containerd_config` as an optional object to the `node_config` variable and wire it through to the `google_container_node_pool` resource via a dynamic block. This allows callers to configure private registry mirrors or custom containerd registry hosts per node pool — useful for air-gapped environments and internal registry proxies. The `registry_hosts` list maps each upstream server to one or more mirror hosts, with optional `capabilities`, `override_path`, and `dial_timeout` fields (all defaulting to sensible values). * refactor(gke-nodepool): use maps for containerd_config registry_hosts and hosts Convert registry_hosts and hosts from lists to maps so that the registry server and host URLs serve as stable keys, avoiding index-shifting issues with for_each. Add default values for capabilities, override_path, and dial_timeout. Update README example and test inventory accordingly. * Remove default values from containerd_config hosts fields Leave capabilities, override_path, and dial_timeout without defaults so the provider/API picks them rather than the module imposing values. * Refine containerd_config variable interface - Simplify header to optional(map(list(string))) - Flatten ca, client cert/key to strings with descriptive names - Derive private_registry_access_config enabled from ca domain config list - Simplify writable_cgroups to optional(bool) - Flatten gcp_secret_manager_certificate_config to string - Remove redundant defaults where try() handles null in main.tf - Fix long lines in main.tf to stay within 79-char limit - Update copyright year to 2026 in inventory files * fix(gke-nodepool): run terraform fmt to fix attribute alignment in containerd_config * docs(gke-nodepool): regenerate README with updated variable line numbers * fix(gke-nodepool): use coalesce instead of try for null header map in for_each * tests(gke-nodepool): update containerd-config inventory to match actual plan output --------- Co-authored-by: Julio Castillo <jccb@google.com>
This commit is contained in:
@@ -230,7 +230,34 @@ module "cluster-1-nodepool-advanced-machine-features" {
|
||||
}
|
||||
}
|
||||
}
|
||||
# tftest modules=1 resources=1
|
||||
# tftest modules=1 resources=1 inventory=advanced-machine-features.yaml
|
||||
```
|
||||
|
||||
### Containerd registry mirror configuration
|
||||
|
||||
This example shows how to configure a private registry mirror for containerd on each node, useful for air-gapped environments or when pulling images through an internal registry proxy.
|
||||
|
||||
```hcl
|
||||
module "cluster-1-nodepool-containerd" {
|
||||
source = "./fabric/modules/gke-nodepool"
|
||||
project_id = "myproject"
|
||||
cluster_name = "cluster-1"
|
||||
location = "europe-west4-a"
|
||||
name = "nodepool-containerd"
|
||||
node_config = {
|
||||
machine_type = "n2-standard-4"
|
||||
containerd_config = {
|
||||
registry_hosts = {
|
||||
"registry.example.com" = {
|
||||
hosts = {
|
||||
"mirror.example.com" = {}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
# tftest modules=1 resources=1 inventory=containerd-config.yaml
|
||||
```
|
||||
<!-- BEGIN TFDOC -->
|
||||
## Variables
|
||||
@@ -239,7 +266,7 @@ module "cluster-1-nodepool-advanced-machine-features" {
|
||||
|---|---|:---:|:---:|:---:|
|
||||
| [cluster_name](variables.tf#L23) | Cluster name. | <code>string</code> | ✓ | |
|
||||
| [location](variables.tf#L48) | Cluster location. | <code>string</code> | ✓ | |
|
||||
| [project_id](variables.tf#L229) | Cluster project id. | <code>string</code> | ✓ | |
|
||||
| [project_id](variables.tf#L251) | Cluster project id. | <code>string</code> | ✓ | |
|
||||
| [cluster_id](variables.tf#L17) | Cluster id. Optional, but providing cluster_id is recommended to prevent cluster misconfiguration in some of the edge cases. | <code>string</code> | | <code>null</code> |
|
||||
| [gke_version](variables.tf#L28) | Kubernetes nodes version. Ignored if auto_upgrade is set in management_config. | <code>string</code> | | <code>null</code> |
|
||||
| [k8s_labels](variables.tf#L34) | Kubernetes labels applied to each node. | <code>map(string)</code> | | <code>{}</code> |
|
||||
@@ -248,15 +275,15 @@ module "cluster-1-nodepool-advanced-machine-features" {
|
||||
| [name](variables.tf#L59) | Optional nodepool name. | <code>string</code> | | <code>null</code> |
|
||||
| [network_config](variables.tf#L65) | Network configuration. | <code>object({…})</code> | | <code>null</code> |
|
||||
| [node_config](variables.tf#L89) | Node-level configuration. | <code>object({…})</code> | | <code>{}</code> |
|
||||
| [node_count](variables.tf#L175) | Number of nodes per instance group. Initial value can only be changed by recreation, current is ignored when autoscaling is used. | <code>object({…})</code> | | <code>{…}</code> |
|
||||
| [node_locations](variables.tf#L187) | Node locations. | <code>list(string)</code> | | <code>null</code> |
|
||||
| [nodepool_config](variables.tf#L193) | Nodepool-level configuration. | <code>object({…})</code> | | <code>null</code> |
|
||||
| [reservation_affinity](variables.tf#L234) | Configuration of the desired reservation which instances could take capacity from. | <code>object({…})</code> | | <code>null</code> |
|
||||
| [resource_manager_tags](variables.tf#L244) | A map of resource manager tag keys and values to be attached to the nodes for managing Compute Engine firewalls using Network Firewall Policies. | <code>map(string)</code> | | <code>null</code> |
|
||||
| [service_account](variables.tf#L250) | Nodepool service account. If this variable is set to null, the default GCE service account will be used. If set and email is null, a service account will be created. If scopes are null a default will be used. | <code>object({…})</code> | | <code>{}</code> |
|
||||
| [sole_tenant_nodegroup](variables.tf#L262) | Sole tenant node group. | <code>string</code> | | <code>null</code> |
|
||||
| [tags](variables.tf#L268) | Network tags applied to nodes. | <code>list(string)</code> | | <code>null</code> |
|
||||
| [taints](variables.tf#L274) | Kubernetes taints applied to all nodes. | <code>map(object({…}))</code> | | <code>{}</code> |
|
||||
| [node_count](variables.tf#L197) | Number of nodes per instance group. Initial value can only be changed by recreation, current is ignored when autoscaling is used. | <code>object({…})</code> | | <code>{…}</code> |
|
||||
| [node_locations](variables.tf#L209) | Node locations. | <code>list(string)</code> | | <code>null</code> |
|
||||
| [nodepool_config](variables.tf#L215) | Nodepool-level configuration. | <code>object({…})</code> | | <code>null</code> |
|
||||
| [reservation_affinity](variables.tf#L256) | Configuration of the desired reservation which instances could take capacity from. | <code>object({…})</code> | | <code>null</code> |
|
||||
| [resource_manager_tags](variables.tf#L266) | A map of resource manager tag keys and values to be attached to the nodes for managing Compute Engine firewalls using Network Firewall Policies. | <code>map(string)</code> | | <code>null</code> |
|
||||
| [service_account](variables.tf#L272) | Nodepool service account. If this variable is set to null, the default GCE service account will be used. If set and email is null, a service account will be created. If scopes are null a default will be used. | <code>object({…})</code> | | <code>{}</code> |
|
||||
| [sole_tenant_nodegroup](variables.tf#L284) | Sole tenant node group. | <code>string</code> | | <code>null</code> |
|
||||
| [tags](variables.tf#L290) | Network tags applied to nodes. | <code>list(string)</code> | | <code>null</code> |
|
||||
| [taints](variables.tf#L296) | Kubernetes taints applied to all nodes. | <code>map(object({…}))</code> | | <code>{}</code> |
|
||||
|
||||
## Outputs
|
||||
|
||||
|
||||
@@ -345,5 +345,100 @@ resource "google_container_node_pool" "nodepool" {
|
||||
threads_per_core = var.node_config.advanced_machine_features.threads_per_core
|
||||
}
|
||||
}
|
||||
dynamic "containerd_config" {
|
||||
for_each = var.node_config.containerd_config != null ? [""] : []
|
||||
content {
|
||||
dynamic "private_registry_access_config" {
|
||||
for_each = try(var.node_config.containerd_config.private_registry_access_config, null) != null ? [""] : []
|
||||
content {
|
||||
enabled = (
|
||||
length(try(
|
||||
var.node_config.containerd_config
|
||||
.private_registry_access_config
|
||||
.certificate_authority_domain_config,
|
||||
[]
|
||||
)) > 0
|
||||
)
|
||||
dynamic "certificate_authority_domain_config" {
|
||||
for_each = try(
|
||||
var.node_config.containerd_config
|
||||
.private_registry_access_config
|
||||
.certificate_authority_domain_config,
|
||||
[]
|
||||
)
|
||||
content {
|
||||
fqdns = certificate_authority_domain_config.value.fqdns
|
||||
gcp_secret_manager_certificate_config {
|
||||
secret_uri = (
|
||||
certificate_authority_domain_config.value
|
||||
.gcp_secret_manager_certificate_config_secret_uri
|
||||
)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
dynamic "writable_cgroups" {
|
||||
for_each = var.node_config.containerd_config.writable_cgroups != null ? [""] : []
|
||||
content {
|
||||
enabled = var.node_config.containerd_config.writable_cgroups
|
||||
}
|
||||
}
|
||||
dynamic "registry_hosts" {
|
||||
for_each = try(var.node_config.containerd_config.registry_hosts, {})
|
||||
content {
|
||||
server = registry_hosts.key
|
||||
dynamic "hosts" {
|
||||
for_each = registry_hosts.value.hosts
|
||||
content {
|
||||
host = hosts.key
|
||||
capabilities = hosts.value.capabilities
|
||||
override_path = hosts.value.override_path
|
||||
dial_timeout = hosts.value.dial_timeout
|
||||
dynamic "header" {
|
||||
for_each = coalesce(hosts.value.header, {})
|
||||
content {
|
||||
key = header.key
|
||||
value = header.value
|
||||
}
|
||||
}
|
||||
dynamic "ca" {
|
||||
for_each = (
|
||||
hosts.value.ca_gcp_secret_manager_secret_uri != null
|
||||
? [hosts.value.ca_gcp_secret_manager_secret_uri]
|
||||
: []
|
||||
)
|
||||
content {
|
||||
gcp_secret_manager_secret_uri = ca.value
|
||||
}
|
||||
}
|
||||
dynamic "client" {
|
||||
for_each = (
|
||||
hosts.value.client != null ? [hosts.value.client] : []
|
||||
)
|
||||
content {
|
||||
cert {
|
||||
gcp_secret_manager_secret_uri = (
|
||||
client.value.cert_gcp_secret_manager_secret_uri
|
||||
)
|
||||
}
|
||||
dynamic "key" {
|
||||
for_each = (
|
||||
client.value.key_gcp_secret_manager_secret_uri != null
|
||||
? [client.value.key_gcp_secret_manager_secret_uri]
|
||||
: []
|
||||
)
|
||||
content {
|
||||
gcp_secret_manager_secret_uri = key.value
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -149,6 +149,28 @@ variable "node_config" {
|
||||
enable_nested_virtualization = optional(bool)
|
||||
threads_per_core = optional(number)
|
||||
}))
|
||||
containerd_config = optional(object({
|
||||
private_registry_access_config = optional(object({
|
||||
certificate_authority_domain_config = optional(list(object({
|
||||
fqdns = list(string)
|
||||
gcp_secret_manager_certificate_config_secret_uri = string
|
||||
})))
|
||||
}))
|
||||
writable_cgroups = optional(bool)
|
||||
registry_hosts = optional(map(object({
|
||||
hosts = optional(map(object({
|
||||
capabilities = optional(list(string))
|
||||
override_path = optional(bool)
|
||||
dial_timeout = optional(string)
|
||||
header = optional(map(list(string)))
|
||||
ca_gcp_secret_manager_secret_uri = optional(string)
|
||||
client = optional(object({
|
||||
cert_gcp_secret_manager_secret_uri = string
|
||||
key_gcp_secret_manager_secret_uri = optional(string)
|
||||
}))
|
||||
})), {})
|
||||
})), {})
|
||||
}))
|
||||
})
|
||||
default = {}
|
||||
nullable = false
|
||||
|
||||
@@ -0,0 +1,28 @@
|
||||
# Copyright 2026 Google LLC
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
values:
|
||||
module.cluster-1-nodepool-advanced-machine-features.google_container_node_pool.nodepool:
|
||||
cluster: cluster-1
|
||||
location: europe-west4-a
|
||||
name: nodepool-advanced-machine-features
|
||||
project: myproject
|
||||
node_config:
|
||||
- machine_type: n2-standard-4
|
||||
advanced_machine_features:
|
||||
- threads_per_core: 1
|
||||
enable_nested_virtualization: null
|
||||
|
||||
counts:
|
||||
google_container_node_pool: 1
|
||||
38
tests/modules/gke_nodepool/examples/containerd-config.yaml
Normal file
38
tests/modules/gke_nodepool/examples/containerd-config.yaml
Normal file
@@ -0,0 +1,38 @@
|
||||
# Copyright 2026 Google LLC
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
values:
|
||||
module.cluster-1-nodepool-containerd.google_container_node_pool.nodepool:
|
||||
cluster: cluster-1
|
||||
location: europe-west4-a
|
||||
name: nodepool-containerd
|
||||
project: myproject
|
||||
node_config:
|
||||
- machine_type: n2-standard-4
|
||||
containerd_config:
|
||||
- private_registry_access_config: []
|
||||
writable_cgroups: []
|
||||
registry_hosts:
|
||||
- server: registry.example.com
|
||||
hosts:
|
||||
- host: mirror.example.com
|
||||
capabilities: null
|
||||
override_path: null
|
||||
dial_timeout: null
|
||||
header: []
|
||||
ca: []
|
||||
client: []
|
||||
|
||||
counts:
|
||||
google_container_node_pool: 1
|
||||
Reference in New Issue
Block a user