Google Cloud BigQuery Module
This module allows managing a single BigQuery dataset, including access configuration, tables and views.
- Simple dataset with access configuration
- IAM roles
- Authorized Views, Datasets, and Routines
- Dataset options
- Tables, views and routines
- Tag bindings
- TODO
- Variables
- Outputs
Simple dataset with access configuration
Access configuration defaults to using the separate google_bigquery_dataset_access resource, so as to leave the default dataset access rules untouched.
You can choose to manage the google_bigquery_dataset access rules instead via the dataset_access variable, but be sure to always have at least one OWNER access and to avoid duplicating accesses, or terraform apply will fail.
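As a minimal sketch of this second approach, the configuration below sets dataset_access so that access rules are managed in the dataset resource itself. The identities are placeholders, and the example keeps a single OWNER entry with no duplicated accesses, in line with the constraints described above. It relies on the access and access_identities variables used throughout this section.

```hcl
module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  # manage access rules in the dataset resource instead of separate resources
  dataset_access = true
  access = {
    # keep at least one OWNER entry, or terraform apply will fail
    owner        = { role = "OWNER", type = "user" }
    reader-group = { role = "READER", type = "group" }
  }
  access_identities = {
    # placeholder identities, replace with real ones
    owner        = "owner@example.org"
    reader-group = "readers@example.org"
  }
}
```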
The access variables are split into access and access_identities variables, so that dynamic values can be passed in for identities (e.g. a service account email generated by a different module or resource).
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
access = {
reader-group = { role = "READER", type = "group" }
owner = { role = "OWNER", type = "user" }
project_owners = { role = "OWNER", type = "special_group" }
view_1 = { role = "READER", type = "view" }
}
access_identities = {
reader-group = "playground-test@ludomagno.net"
owner = "ludo@ludomagno.net"
project_owners = "projectOwners"
view_1 = "my-project|my_dataset|my-table"
}
}
# tftest modules=1 resources=5 inventory=simple.yaml
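The split between access and access_identities matters when an identity is only known after apply. The sketch below is illustrative rather than taken from the module's tests: the google_service_account resource and its account_id are hypothetical, and it assumes a service account is passed with type user, since BigQuery treats service accounts as user-by-email access entries.

```hcl
# hypothetical service account created outside the module
resource "google_service_account" "reader" {
  project    = "my-project"
  account_id = "bq-reader"
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  access = {
    # keys, roles and types are static and known at plan time
    owner     = { role = "OWNER", type = "user" }
    reader-sa = { role = "READER", type = "user" }
  }
  access_identities = {
    owner = "owner@example.org"
    # dynamic value, only known once the service account exists
    reader-sa = google_service_account.reader.email
  }
}
```

Because the map keys in access are static, Terraform can plan the access resources even though the email itself is computed.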
IAM roles
Access can also be configured through IAM roles instead of basic roles, using the iam variable. When IAM is used, basic roles cannot be set through the access family of variables.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
iam = {
"roles/bigquery.dataOwner" = ["user:user1@example.org"]
}
iam_bindings = {
reader_user = {
role = "roles/bigquery.dataViewer"
members = ["user:user2@example.org"]
}
}
}
# tftest modules=1 resources=3 inventory=iam.yaml
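The variables table further down also lists iam_by_principals, which takes the inverse {PRINCIPAL => [ROLES]} shape and is merged with iam. A minimal sketch, with placeholder principals and roles chosen purely for illustration:

```hcl
module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  iam = {
    "roles/bigquery.dataOwner" = ["user:user1@example.org"]
  }
  # role lists keyed by principal, merged with iam by the module;
  # principals should be static strings, not values computed at apply time
  iam_by_principals = {
    "user:user2@example.org" = [
      "roles/bigquery.dataViewer",
      "roles/bigquery.metadataViewer"
    ]
  }
}
```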
Authorized Views, Datasets, and Routines
You can specify authorized views, datasets, and routines via the authorized_views, authorized_datasets and authorized_routines variables, respectively.
// Create private BigQuery dataset that will not be publicly accessible, except via the authorized BigQuery resources
module "bigquery-dataset-private" {
source = "./fabric/modules/bigquery-dataset"
project_id = "private_project"
id = "private_dataset"
authorized_views = [
{
project_id = "auth_view_project"
dataset_id = "auth_view_dataset"
table_id = "auth_view"
}
]
authorized_datasets = [
{
project_id = "auth_dataset_project"
dataset_id = "auth_dataset"
}
]
authorized_routines = [
{
project_id = "auth_routine_project"
dataset_id = "auth_routine_dataset"
routine_id = "auth_routine"
}
]
}
// Create authorized view in a public dataset
module "bigquery-authorized-views-dataset-public" {
source = "./fabric/modules/bigquery-dataset"
project_id = "auth_view_project"
id = "auth_view_dataset"
views = {
auth_view = {
friendly_name = "Public"
labels = {}
query = "SELECT * FROM `private_project.private_dataset.private_table`"
use_legacy_sql = false
deletion_protection = true
}
}
}
// Create public authorized dataset
module "bigquery-authorized-dataset-public" {
source = "./fabric/modules/bigquery-dataset"
project_id = "auth_dataset_project"
id = "auth_dataset"
}
// Create public authorized routine
module "bigquery-authorized-authorized-routine-dataset-public" {
source = "./fabric/modules/bigquery-dataset"
project_id = "auth_routine_project"
id = "auth_routine_dataset"
}
resource "google_bigquery_routine" "public-routine" {
project = "private_project"
dataset_id = module.bigquery-authorized-authorized-routine-dataset-public.dataset_id
routine_id = "auth_routine"
routine_type = "TABLE_VALUED_FUNCTION"
language = "SQL"
definition_body = <<-EOS
SELECT 1 + value AS value
EOS
arguments {
name = "value"
argument_kind = "FIXED_TYPE"
data_type = jsonencode({ "typeKind" = "INT64" })
}
return_table_type = jsonencode({ "columns" = [
{ "name" = "value", "type" = { "typeKind" = "INT64" } },
] })
}
# tftest modules=4 resources=9 inventory=authorized_resources.yaml
Authorized views can be specified using both the standard access options and the authorized_views blocks. The example configuration below uses both, and will create a dataset with three authorized views: view_id_1, view_id_2, and view_id_3.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
authorized_views = [
{
project_id = "view_project"
dataset_id = "view_dataset"
table_id = "view_id_1"
},
{
project_id = "view_project"
dataset_id = "view_dataset"
table_id = "view_id_2"
}
]
access = {
view_2 = { role = "READER", type = "view" }
view_3 = { role = "READER", type = "view" }
}
access_identities = {
view_2 = "view_project|view_dataset|view_id_2"
view_3 = "view_project|view_dataset|view_id_3"
}
}
# tftest modules=1 resources=4 inventory=authorized_resources_views.yaml
Dataset options
Dataset options are set via the options variable. All options must be specified, but a null value can be set for options that should use defaults.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
options = {
default_table_expiration_ms = 3600000
default_partition_expiration_ms = null
delete_contents_on_destroy = false
max_time_travel_hours = 168
}
}
# tftest modules=1 resources=1 inventory=options.yaml
Tables, views and routines
Tables are created via the tables variable. Support for external tables will be added in a future release.
locals {
countries_schema = jsonencode([
{ name = "country", type = "STRING" },
{ name = "population", type = "INT64" },
])
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tables = {
countries = {
friendly_name = "Countries"
schema = local.countries_schema
deletion_protection = true
}
}
}
# tftest modules=1 resources=2 inventory=tables.yaml
If partitioning is needed, populate the partitioning variable using either the time or range attribute.
locals {
countries_schema = jsonencode([
{ name = "country", type = "STRING" },
{ name = "population", type = "INT64" },
])
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tables = {
table_a = {
deletion_protection = true
friendly_name = "Table a"
schema = local.countries_schema
partitioning = {
time = { type = "DAY", expiration_ms = null }
}
}
}
}
# tftest modules=1 resources=2 inventory=partitioning.yaml
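The example above covers the time attribute only. A range-partitioned table might look like the sketch below. The attribute names (field, and start, end, interval inside range) are an assumption based on the provider's range partitioning block and are not confirmed by this README, so check the module's variables.tf before relying on them.

```hcl
locals {
  countries_schema = jsonencode([
    { name = "country", type = "STRING" },
    { name = "population", type = "INT64" },
  ])
}

module "bigquery-dataset" {
  source     = "./fabric/modules/bigquery-dataset"
  project_id = "my-project"
  id         = "my_dataset"
  tables = {
    table_b = {
      deletion_protection = true
      friendly_name       = "Table b"
      schema              = local.countries_schema
      partitioning = {
        # assumed attribute names, verify against variables.tf
        field = "population"
        # 100 integer buckets of 10 million each
        range = { start = 0, end = 1000000000, interval = 10000000 }
      }
    }
  }
}
```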
To create views, use the views variable. If a view queries a table created by the same module, terraform apply will initially fail, and will succeed on a later run once the underlying table has been created. Using the module's output in the view's query to create a dependency on the table should probably also work.
locals {
countries_schema = jsonencode([
{ name = "country", type = "STRING" },
{ name = "population", type = "INT64" },
])
population_schema = [
{
name = "total",
type = "INT64",
description = "Total population"
}
]
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tables = {
countries = {
friendly_name = "Countries"
schema = local.countries_schema
deletion_protection = true
}
}
views = {
population = {
friendly_name = "Population"
query = "SELECT SUM(population) AS total FROM my_dataset.countries"
schema = local.population_schema
use_legacy_sql = false
deletion_protection = true
}
}
}
# tftest modules=1 resources=3 inventory=views.yaml
To create routines, use the routines variable.
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
routines = {
custom_masking_routine = {
routine_type = "SCALAR_FUNCTION"
language = "SQL"
data_governance_type = "DATA_MASKING"
definition_body = "SAFE.REGEXP_REPLACE(ssn, '[0-9]', 'X')"
return_type = "{\"typeKind\" : \"STRING\"}"
arguments = {
ssn = {
data_type = "{\"typeKind\" : \"STRING\"}"
}
}
}
}
}
# tftest modules=1 resources=2 inventory=routines.yaml
Tag bindings
Refer to the Creating and managing tags documentation for details on usage.
module "org" {
source = "./fabric/modules/organization"
organization_id = var.organization_id
tags = {
environment = {
description = "Environment specification."
values = {
dev = {}
prod = {}
sandbox = {}
}
}
}
}
module "bigquery-dataset" {
source = "./fabric/modules/bigquery-dataset"
project_id = "my-project"
id = "my_dataset"
tag_bindings = {
env-sandbox = module.org.tag_values["environment/sandbox"].id
}
}
# tftest modules=2 resources=6
TODO
- check for dynamic values in tables and views
- add support for external tables
Variables
| name | description | type | required | default |
|---|---|---|---|---|
| id | Dataset id. | string | ✓ | |
| project_id | Id of the project where datasets will be created. | string | ✓ | |
| access | Map of access rules with role and identity type. Keys are arbitrary and must match those in the access_identities variable, types are domain, group, special_group, user, view. | map(object({…})) | | {} |
| access_identities | Map of access identities used for basic access roles. View identities have the format 'project_id\|dataset_id\|table_id'. | map(string) | | {} |
| authorized_datasets | An array of datasets to be authorized on the dataset. | list(object({…})) | | [] |
| authorized_routines | An array of routines to be authorized on the dataset. | list(object({…})) | | [] |
| authorized_views | An array of views to be authorized on the dataset. | list(object({…})) | | [] |
| context | Context-specific interpolations. | object({…}) | | {} |
| dataset_access | Set access in the dataset resource instead of using separate resources. | bool | | false |
| description | Optional description. | string | | "Terraform managed." |
| encryption_key | Self link of the KMS key that will be used to protect destination table. | string | | null |
| friendly_name | Dataset friendly name. | string | | null |
| iam | IAM bindings in {ROLE => [MEMBERS]} format. Mutually exclusive with the access_* variables used for basic roles. | map(list(string)) | | {} |
| iam_bindings | Authoritative IAM bindings in {KEY => {role = ROLE, members = [], condition = {}}}. Keys are arbitrary. | map(object({…})) | | {} |
| iam_bindings_additive | Individual additive IAM bindings. Keys are arbitrary. | map(object({…})) | | {} |
| iam_by_principals | Authoritative IAM binding in {PRINCIPAL => [ROLES]} format. Principals need to be statically defined to avoid errors. Merged internally with the iam variable. | map(list(string)) | | {} |
| labels | Dataset labels. | map(string) | | {} |
| location | Dataset location. | string | | "EU" |
| materialized_views | Materialized views definitions. | map(object({…})) | | {} |
| options | Dataset options. | object({…}) | | {} |
| routines | Routine definitions. | map(object({…})) | | {} |
| tables | Table definitions. Options and partitioning default to null. Partitioning can only use range or time, set the unused one to null. | map(object({…})) | | {} |
| tag_bindings | Tag bindings for this dataset, in key => tag value id format. | map(string) | | {} |
| views | View definitions. | map(object({…})) | | {} |
Outputs
| name | description | sensitive |
|---|---|---|
| dataset | Dataset resource. | |
| dataset_id | Dataset id. | |
| id | Fully qualified dataset id. | |
| materialized_view_ids | Map of fully qualified materialized view ids keyed by view ids. | |
| materialized_views | Materialized view resources. | |
| routine_ids | Map of fully qualified routine ids keyed by routine ids. | |
| routines | Routine resources. | |
| self_link | Dataset self link. | |
| table_ids | Map of fully qualified table ids keyed by table ids. | |
| tables | Table resources. | |
| view_ids | Map of fully qualified view ids keyed by view ids. | |
| views | View resources. | |