# GKE Standard cluster module

This module offers a way to create and manage Google Kubernetes Engine (GKE) [Standard clusters](https://cloud.google.com/kubernetes-engine/docs/concepts/choose-cluster-mode#why-standard). With its sensible default settings based on best practices and authors' experience as Google Cloud practitioners, the module accommodates many common use cases out-of-the-box, without having to rely on verbose configuration.

> [!IMPORTANT]
> This module should be used together with the [`gke-nodepool`](../gke-nodepool/) module because the default node pool is deleted upon cluster creation by default.

- [Cluster access configurations](#cluster-access-configurations)
  - [Private cluster with DNS endpoint enabled](#private-cluster-with-dns-endpoint-enabled)
  - [Public cluster](#public-cluster)
  - [Allowing access from Google Cloud services](#allowing-access-from-google-cloud-services)
- [Regional cluster](#regional-cluster)
- [Enable Dataplane V2](#enable-dataplane-v2)
- [Managing GKE logs](#managing-gke-logs)
- [Upgrade notifications](#upgrade-notifications)
- [Monitoring configuration](#monitoring-configuration)
- [Disable GKE logs or metrics collection](#disable-gke-logs-or-metrics-collection)
- [Cloud DNS](#cloud-dns)
- [Backup for GKE](#backup-for-gke)
- [Automatic creation of new secondary ranges](#automatic-creation-of-new-secondary-ranges)
- [Node auto-provisioning with GPUs and TPUs](#node-auto-provisioning-with-gpus-and-tpus)
  - [Disable PSC endpoint creation](#disable-psc-endpoint-creation)
- [Variables](#variables)
- [Outputs](#outputs)

## Cluster access configurations

The `access_config` variable can be used to configure access to the control plane, and public access to nodes. The following examples illustrate different possible configurations.

### Private cluster with DNS endpoint enabled

The default module configuration creates a cluster with private nodes, no public endpoint, and access via the DNS endpoint enabled.
The default variable configuration is shown in comments. Master authorized ranges can be set via the `access_config.ip_access.authorized_ranges` attribute.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  # access_config can be omitted if master authorized ranges are not needed
  access_config = {
    # defaults to true
    # dns_access = {
    #   allow_external_traffic = true
    # }
    ip_access = {
      authorized_ranges = {
        internal-vms = "10.0.0.0/8"
      }
    }
    # private_nodes = true
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=access-private.yaml
```

### Public cluster

To configure a public cluster, set `access_config.ip_access.disable_public_endpoint` to `false`. Nodes can be left private or made public if needed, as in the example below. The DNS endpoint is turned off here as it's probably redundant for a public cluster.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  access_config = {
    dns_access = {
      allow_external_traffic = false
    }
    ip_access = {
      authorized_ranges = {
        "corporate proxy" = "8.8.8.8/32"
      }
      gcp_public_cidrs_access_enabled = false
      disable_public_endpoint         = false
    }
    private_nodes = false
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=access-public.yaml
```

### Allowing access from Google Cloud services

To allow access to your cluster from Google Cloud services (like Cloud Shell, Cloud Build, etc.)
without needing to manually specify all Google Cloud IP ranges, you can use the `gcp_public_cidrs_access_enabled` parameter:

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  access_config = {
    dns_access = {
      allow_external_traffic = false
    }
    ip_access = {
      authorized_ranges = {
        internal-vms = "10.0.0.0/8"
      }
      gcp_public_cidrs_access_enabled = true
      disable_public_endpoint         = false
    }
    private_nodes = false
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=access-google.yaml
```

## Regional cluster

Regional clusters are created by setting `location` to a GCP region and then configuring `node_locations`, as shown in the example below.

```hcl
module "cluster-1" {
  source         = "./fabric/modules/gke-cluster-standard"
  project_id     = "myproject"
  name           = "cluster-1"
  location       = "europe-west1"
  node_locations = ["europe-west1-b"]
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=regional.yaml
```

## Enable Dataplane V2

This example shows how to [create a zonal GKE Cluster with Dataplane V2 enabled](https://cloud.google.com/kubernetes-engine/docs/how-to/dataplane-v2).
```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-dataplane-v2"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  enable_features = {
    dataplane_v2          = true
    fqdn_network_policy   = true
    secret_manager_config = true
    workload_identity     = true
  }
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1
```

## Managing GKE logs

This example shows you how to [control which logs are sent from your GKE cluster to Cloud Logging](https://cloud.google.com/stackdriver/docs/solutions/gke/installing).

When you create a new GKE cluster, [Cloud Operations for GKE](https://cloud.google.com/stackdriver/docs/solutions/gke) integration with Cloud Logging is enabled by default and [System logs](https://cloud.google.com/stackdriver/docs/solutions/gke/managing-logs#what_logs) are collected. You can enable collection of several other [types of logs](https://cloud.google.com/stackdriver/docs/solutions/gke/managing-logs#what_logs). The following example enables collection of *all* optional logs.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  logging_config = {
    enable_workloads_logs          = true
    enable_api_server_logs         = true
    enable_scheduler_logs          = true
    enable_controller_manager_logs = true
  }
}
# tftest modules=1 resources=1 inventory=logging-config-enable-all.yaml
```

## Upgrade notifications

Upgrade notifications are configured via the `enable_features.upgrade_notifications` attribute. An existing Pub/Sub topic can be defined via its `topic` attribute, or a new one can be created if the attribute is not set. The `event_types` attribute can be used to control which event types are sent.
The `kms_key_name` attribute can be used to control which KMS key is used to encrypt the notification messages.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  enable_features = {
    upgrade_notifications = {
      event_types  = ["SECURITY_BULLETIN_EVENT", "UPGRADE_EVENT"]
      kms_key_name = "projects/myproject/locations/global/keyRings/mykeyring/cryptoKeys/mykey"
    }
  }
}
# tftest modules=1 resources=2 inventory=notifications.yaml
```

## Monitoring configuration

This example shows how to [configure collection of Kubernetes control plane metrics](https://cloud.google.com/stackdriver/docs/solutions/gke/managing-metrics#enable-control-plane-metrics). These metrics are optional and are not collected by default.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  monitoring_config = {
    enable_api_server_metrics         = true
    enable_controller_manager_metrics = true
    enable_scheduler_metrics          = true
  }
}
# tftest modules=1 resources=1 inventory=monitoring-config-control-plane.yaml
```

The next example shows how to [configure collection of kube state metrics](https://cloud.google.com/stackdriver/docs/solutions/gke/managing-metrics#enable-ksm). These metrics are optional and are not collected by default.
```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  monitoring_config = {
    enable_cadvisor_metrics    = true
    enable_daemonset_metrics   = true
    enable_deployment_metrics  = true
    enable_hpa_metrics         = true
    enable_pod_metrics         = true
    enable_statefulset_metrics = true
    enable_storage_metrics     = true
    # Kube state metrics collection requires Google Cloud Managed Service for Prometheus,
    # which is enabled by default.
    # enable_managed_prometheus = true
  }
}
# tftest modules=1 resources=1 inventory=monitoring-config-kube-state.yaml
```

The *control plane metrics* and *kube state metrics* collection can be configured in a single `monitoring_config` block.

## Disable GKE logs or metrics collection

> [!WARNING]
> If you've disabled Cloud Logging or Cloud Monitoring, GKE customer support
> is offered on a best-effort basis and might require additional effort
> from your engineering team.

This example shows how to fully disable logs collection on a zonal GKE Standard cluster. This is not recommended.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  logging_config = {
    enable_system_logs = false
  }
}
# tftest modules=1 resources=1 inventory=logging-config-disable-all.yaml
```

The next example shows how to fully disable metrics collection on a zonal GKE Standard cluster. This is not recommended.
```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  monitoring_config = {
    enable_system_metrics     = false
    enable_managed_prometheus = false
  }
}
# tftest modules=1 resources=1 inventory=monitoring-config-disable-all.yaml
```

## Cloud DNS

This example shows how to [use Cloud DNS as a Kubernetes DNS provider](https://cloud.google.com/kubernetes-engine/docs/how-to/cloud-dns) for GKE Standard clusters.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  enable_features = {
    dns = {
      provider = "CLOUD_DNS"
      scope    = "CLUSTER_SCOPE"
      domain   = "gke.local"
    }
  }
}
# tftest modules=1 resources=1 inventory=dns.yaml
```

## Backup for GKE

> [!NOTE]
> Although Backup for GKE can be enabled as an add-on when configuring your GKE clusters, it is a separate service from GKE.

[Backup for GKE](https://cloud.google.com/kubernetes-engine/docs/add-on/backup-for-gke/concepts/backup-for-gke) is a service for backing up and restoring workloads in GKE clusters. It has two components:

- A [Google Cloud API](https://cloud.google.com/kubernetes-engine/docs/add-on/backup-for-gke/reference/rest) that serves as the control plane for the service.
- A GKE add-on (the [Backup for GKE agent](https://cloud.google.com/kubernetes-engine/docs/add-on/backup-for-gke/concepts/backup-for-gke#agent_overview)) that must be enabled in each cluster for which you wish to perform backup and restore operations.


This example shows how to [enable Backup for GKE on a new zonal GKE Standard cluster](https://cloud.google.com/kubernetes-engine/docs/add-on/backup-for-gke/how-to/install#enable_on_a_new_cluster_optional) and [plan a set of backups](https://cloud.google.com/kubernetes-engine/docs/add-on/backup-for-gke/how-to/backup-plan).

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  backup_configs = {
    enable_backup_agent = true
    backup_plans = {
      "backup-1" = {
        region   = "europe-west2"
        schedule = "0 9 * * 1"
        applications = {
          namespace-1 = ["app-1", "app-2"]
        }
      }
    }
  }
}
# tftest modules=1 resources=2 inventory=backup.yaml
```

## Automatic creation of new secondary ranges

You can use `var.vpc_config.secondary_range_blocks` to let GKE create new secondary ranges for the cluster. The example below reserves an available /14 block for pods and a /20 for services.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_blocks = {
      pods     = ""
      services = "/20" # can be an empty string as well
    }
  }
}
# tftest modules=1 resources=1
```

## Node auto-provisioning with GPUs and TPUs

You can use the `var.cluster_autoscaling` block to configure node auto-provisioning for the GKE cluster. The example below configures limits for CPU, memory, GPUs and TPUs.
```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_blocks = {
      pods     = ""
      services = "/20"
    }
  }
  cluster_autoscaling = {
    cpu_limits = {
      max = 48
    }
    mem_limits = {
      max = 182
    }
    # Can be GPUs or TPUs
    accelerator_resources = [
      {
        resource_type = "nvidia-l4"
        max           = 2
      },
      {
        resource_type = "tpu-v5-lite-podslice"
        max           = 2
      }
    ]
  }
}
# tftest modules=1 resources=1
```

### Disable PSC endpoint creation

To disable IP access to the GKE control plane and prevent PSC endpoint creation, set `var.access_config.ip_access` to `null` or omit the variable.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1"
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=no-ip-access.yaml
```

## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [location](variables.tf#L304) | Cluster zone or region. | string | ✓ | |
| [name](variables.tf#L419) | Cluster name. | string | ✓ | |
| [project_id](variables.tf#L471) | Cluster project id. | string | ✓ | |
| [vpc_config](variables.tf#L482) | VPC-level configuration. | object({…}) | ✓ | |
| [access_config](variables.tf#L17) | Control plane endpoint and nodes access configurations. | object({…}) | | {} |
| [backup_configs](variables.tf#L49) | Configuration for Backup for GKE. | object({…}) | | {} |
| [cluster_autoscaling](variables.tf#L72) | Enable and configure limits for Node Auto-Provisioning with Cluster Autoscaler. | object({…}) | | null |
| [default_nodepool](variables.tf#L152) | Enable default nodepool. | object({…}) | | {} |
| [deletion_protection](variables.tf#L170) | Whether or not to allow Terraform to destroy the cluster. Unless this field is set to false in Terraform state, a terraform destroy or terraform apply that would delete the cluster will fail. | bool | | true |
| [description](variables.tf#L177) | Cluster description. | string | | null |
| [enable_addons](variables.tf#L183) | Addons enabled in the cluster (true means enabled). | object({…}) | | {} |
| [enable_features](variables.tf#L205) | Enable cluster-level features. Certain features allow configuration. | object({…}) | | {} |
| [fleet_project](variables.tf#L285) | The name of the fleet host project where this cluster will be registered. | string | | null |
| [issue_client_certificate](variables.tf#L291) | Enable issuing client certificate. | bool | | false |
| [labels](variables.tf#L297) | Cluster resource labels. | map(string) | | {} |
| [logging_config](variables.tf#L309) | Logging configuration. | object({…}) | | {} |
| [maintenance_config](variables.tf#L330) | Maintenance window configuration. | object({…}) | | {…} |
| [max_pods_per_node](variables.tf#L353) | Maximum number of pods per node in this cluster. | number | | 110 |
| [min_master_version](variables.tf#L359) | Minimum version of the master, defaults to the version of the most recent official release. | string | | null |
| [monitoring_config](variables.tf#L365) | Monitoring configuration. Google Cloud Managed Service for Prometheus is enabled by default. | object({…}) | | {} |
| [node_config](variables.tf#L424) | Node-level configuration. | object({…}) | | {} |
| [node_locations](variables.tf#L447) | Zones in which the cluster's nodes are located. | list(string) | | [] |
| [node_pool_auto_config](variables.tf#L454) | Node pool configs that apply to auto-provisioned node pools in autopilot clusters and node auto-provisioning-enabled clusters. | object({…}) | | {} |
| [release_channel](variables.tf#L476) | Release channel for GKE upgrades. | string | | null |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| [ca_certificate](outputs.tf#L17) | Public certificate of the cluster (base64-encoded). | ✓ |
| [cluster](outputs.tf#L25) | Cluster resource. | ✓ |
| [dns_endpoint](outputs.tf#L31) | Control plane DNS endpoint. | |
| [endpoint](outputs.tf#L39) | Cluster endpoint. | |
| [id](outputs.tf#L44) | Fully qualified cluster id. | |
| [location](outputs.tf#L49) | Cluster location. | |
| [master_version](outputs.tf#L54) | Master version. | |
| [name](outputs.tf#L59) | Cluster name. | |
| [notifications](outputs.tf#L64) | GKE PubSub notifications topic. | |
| [self_link](outputs.tf#L69) | Cluster self link. | ✓ |
| [workload_identity_pool](outputs.tf#L75) | Workload identity pool. | |
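
Since the default node pool is deleted on cluster creation, the `name` and `location` outputs are typically what a node pool definition consumes. The following is a minimal sketch of that wiring, not a tested example: the `gke-nodepool` variable names (`cluster_name`, `location`, `name`, `project_id`) and the node pool name `nodepool-1` are assumptions, so check them against that module's own README before use.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
}

# Hypothetical node pool: the variable names below are assumed,
# not taken from this README.
module "cluster-1-nodepool-1" {
  source       = "./fabric/modules/gke-nodepool"
  project_id   = "myproject"
  cluster_name = module.cluster-1.name
  location     = module.cluster-1.location
  name         = "nodepool-1"
}
```

Referencing the cluster outputs rather than repeating the literal name and location lets Terraform infer that the node pool depends on the cluster.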