Reorganize ADRs and new versioning ADR (#2642)
* Reorganize ADRs and new versioning ADR
* Workflow examples
* Fix ADR links
* Changes discussed with ludoo
* Fix image reference
* Update image
* Fix typo
* Complete decision section

---------

Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
97
adrs/20241029-versioning.md
Normal file
@@ -0,0 +1,97 @@
|
||||
# Versioning Scheme Tied to FAST Releases
|
||||
|
||||
**authors:** [Ludo](https://github.com/ludoo), [Julio](https://github.com/jccb), [Simone](https://github.com/sruffilli) \
|
||||
**date:** Oct 29, 2024 \
|
||||
**last update:** Oct 30, 2024
|
||||
|
||||
## Status
|
||||
|
||||
Piloting
|
||||
|
||||
## Context
|
||||
|
||||
Our current versioning scheme releases new versions based on changes across modules. This approach was suitable when modules were the primary focus of development. However, with the increasing importance of FAST, this process no longer aligns with our priorities. We need a versioning scheme that reflects the significance of FAST releases and allows for more frequent updates to modules and documentation. The current release process wasn't designed with FAST in mind, causing friction and delaying releases.
|
||||
|
||||
## Proposal
|
||||
|
||||
Change the versioning scheme as follows:
|
||||
|
||||
- **Major Release (X.0.0):** A major release is reserved for breaking changes to the core functionality of FAST, meaning any modification that requires users to change variables or manipulate state to maintain compatibility. Removing functionality from FAST is also considered a breaking change.
|
||||
- **Minor Release (1.Y.0):** A minor release signifies breaking changes within individual modules or components of the project that preserve backward compatibility with the overall structure and purpose of the module collection.
|
||||
- **Patch Release (1.0.Z):** Any other change that does not break compatibility, including bug fixes, performance enhancements, and new non-breaking features, constitutes a patch release. These updates are backward compatible and should not require any modifications to existing Terraform configurations.
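
For module consumers, the scheme matters mostly at the point where a release is pinned. The following is a minimal sketch, assuming modules are consumed through a Git source with a `ref` tag; the tag `v35.0.0` and all input values are illustrative assumptions, not taken from an actual release.

```hcl
# Hypothetical consumer configuration pinning a specific release tag.
module "project" {
  # The tag below is an assumed example of the X.Y.Z scheme described above.
  source = "github.com/GoogleCloudPlatform/cloud-foundation-fabric//modules/project?ref=v35.0.0"

  # Illustrative inputs only; real callers pass their own values.
  name            = "my-project"
  billing_account = "012345-67890A-BCDEF0"
  parent          = "folders/1234567890"
}
```

Under the proposed scheme, moving this pin across a patch release (Z) should need no other changes, while a minor (Y) or major (X) bump signals that module or FAST interfaces may require updates.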
|
||||
|
||||
For the purpose of this document, a breaking change is any code change that forces any caller to update their references to the modified code. The following is a non-exhaustive list of **breaking** changes for a module:
|
||||
- Adding a new required input variable.
|
||||
- Removing or renaming an existing input variable referenced elsewhere in the codebase.
|
||||
- Adding new required fields to the type of an existing variable referenced elsewhere in the codebase.
|
||||
- Removing or renaming an existing output referenced elsewhere in the codebase.
|
||||
- Changing the structure of an output referenced elsewhere in the codebase.
|
||||
|
||||
The following is a non-exhaustive list of **non-breaking** changes for a module:
|
||||
- Adding new optional input variables.
|
||||
- Adding new optional fields to existing input variables.
|
||||
- Adding new output variables.
|
||||
- Adding new resources that do not affect existing resources.
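
As a sketch of the distinction, consider a hypothetical module variable (the name `labels_config` and its fields are assumptions made for illustration). Adding an optional attribute with a default leaves existing callers untouched, while adding a required attribute would force every caller to change.

```hcl
# Hypothetical module variable, used only to illustrate the two cases.
variable "labels_config" {
  description = "Example object variable."
  type = object({
    # Existing required field: callers already set it.
    team = string
    # Non-breaking addition: optional field with a default value.
    environment = optional(string, "dev")
    # Breaking addition (left commented out): a new required field
    # would make every existing caller fail until it is updated.
    # cost_center = string
  })
}
```

A caller passing `labels_config = { team = "platform" }` keeps working after the optional field is added, which is why such a change fits in a patch release.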
|
||||
|
||||
### Development Workflow:
|
||||
|
||||
* **Modules and Documentation:** Changes to modules and documentation will be made directly to the `master` branch.
|
||||
* **FAST Development:** FAST development will occur in a dedicated, protected branch named `fast-dev`. All changes to `fast-dev` must be submitted via Pull Requests.
|
||||
|
||||
As shown in the diagram below, the repository will now contain two long-lived branches: `master` and `fast-dev`.
|
||||
|
||||

|
||||
|
||||
### FAST Release Process:
|
||||
|
||||
This case is highlighted in green in the diagram above. The process is as follows:
|
||||
|
||||
1. Merge `master` into `fast-dev`. This ensures that the latest module and documentation changes are included in the FAST release.
|
||||
1. Create a PR from `fast-dev` to `master`. This allows for a final review of all changes included in the release and ensures that all tests pass against the release candidate.
|
||||
1. Merge the PR into `master` and tag with the new major version number (e.g., v2.0.0, v3.0.0).
|
||||
1. Create a new release from the `master` branch in GitHub as explained in the [Contributing guide](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/master/CONTRIBUTING.md#cutting-a-new-release)
|
||||
|
||||
### FAST Pre-release Process:
|
||||
|
||||
This case is highlighted in red in the diagram above. The process is as follows:
|
||||
|
||||
1. Merge `master` into `fast-dev`. This ensures that the latest module and documentation changes are included in the FAST release.
|
||||
1. Merge the PR into `master` and tag with the new major version number (e.g., v2.0.0, v3.0.0).
|
||||
1. Create a new pre-release from `fast-dev` in GitHub as explained in the [Contributing guide](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/master/CONTRIBUTING.md#cutting-a-new-release)
|
||||
|
||||
### Development Workflow Examples:
|
||||
|
||||
#### Scenario 1: changes that don't break FAST for existing users
|
||||
|
||||
- Start a new branch from `master`.
|
||||
- Develop changes.
|
||||
- Open and merge a PR against `master`. If needed, add the `breaks-modules` label to the PR.
|
||||
|
||||
#### Scenario 2: changes that break FAST for existing users
|
||||
|
||||
- Start a new branch from `fast-dev`.
|
||||
- Develop changes.
|
||||
- Open and merge a PR against `fast-dev`. If needed, add the `breaks-Fast` label to the PR.
|
||||
|
||||
> [!TIP]
|
||||
> As part of developing your changes, we encourage you to merge `master` frequently into your own branch to simplify the final merge back to `master`.
|
||||
|
||||
## Decision
|
||||
|
||||
Pilot starting from version 35.
|
||||
|
||||
## Consequences
|
||||
|
||||
- **Clearer Versioning:** Version numbers will clearly indicate major FAST releases.
|
||||
- **Faster Module Updates:** Modules and documentation can be updated more frequently without being tied to the FAST release cycle.
|
||||
- **Improved FAST Release Process:** The dedicated `fast-dev` branch and PR process will lead to more stable and predictable FAST releases.
|
||||
- **Increased Development Velocity:** Decoupling module and FAST development will increase overall development velocity.
|
||||
- **Potential Learning Curve:** Developers will need to adapt to the new branching and release workflow.
|
||||
|
||||
## Implementation:
|
||||
|
||||
- Create the protected `fast-dev` branch.
|
||||
- Update documentation to reflect the new versioning scheme and release process.
|
||||
- Create new labels for PRs and update the changelog generation script to account for the new labels and branches.
|
||||
|
||||
As a future improvement we can consider developing a GitHub Action for automated release creation, including tagging, release notes generation, etc.
|
||||
BIN
adrs/20241029-versioning.png
Normal file
Binary file not shown.
36
adrs/fast/0-bootstram-user-iam.md
Normal file
@@ -0,0 +1,36 @@
|
||||
# Remove initial gcloud commands needed to bootstrap
|
||||
|
||||
**authors:** [Ludo](https://github.com/ludoo)\
|
||||
**date:** July 13, 2023
|
||||
|
||||
## Status
|
||||
|
||||
Rejected.
|
||||
|
||||
## Context
|
||||
|
||||
The initial `gcloud` commands that grant IAM roles to the user running `apply` for the first time are sometimes seen as an extra hurdle and an unnecessary complication.
|
||||
|
||||
These are the roles in question:
|
||||
|
||||
- `roles/logging.admin`
|
||||
- `roles/owner`
|
||||
- `roles/resourcemanager.organizationAdmin`
|
||||
- `roles/resourcemanager.projectCreator`
|
||||
|
||||
One proposal we investigated was internalizing those IAM bindings in the actual Terraform code, either via bare resources or an additional organization module invocation, and making subsequent resources depend on it.
|
||||
|
||||
On further investigation, this poses a few challenges:
|
||||
|
||||
- the roles in question are managed authoritatively, and it would be best if they remained so (e.g. to clear the Project Creator role, or ensure Organization Administrators match what is in the code)
|
||||
- project creation depends on those roles, but this creates a dependency cycle, as the service accounts created are also assigned those roles and cannot implicitly depend (via the project) on the same roles
|
||||
|
||||
Working around this issue would require jumping through a substantial number of hoops and a lot of development effort. It would also result in potentially less safe and more fragile code.
|
||||
|
||||
## Decision
|
||||
|
||||
We decided to leave those external commands in place, as the hurdle is minimal and not worth the expense and risks of removing it.
|
||||
|
||||
## Consequences
|
||||
|
||||
Nothing changes due to this decision.
|
||||
85
adrs/fast/0-cicd-plan-sa.md
Normal file
@@ -0,0 +1,85 @@
|
||||
# Add new service accounts for CI/CD with plan-only permissions
|
||||
|
||||
**authors:** [Ludo](https://github.com/ludoo) \
|
||||
**date:** December 3, 2023
|
||||
|
||||
## Status
|
||||
|
||||
In development.
|
||||
|
||||
## Context
|
||||
|
||||
The current CI/CD workflows are inherently insecure, as the same service account is used to run `terraform plan` in PR checks, and `terraform apply` in merges.
|
||||
|
||||
The current repository configuration variable allows setting a branch, which could be used to restrict use of the service account to merges. In practice its only effect is to prevent PR checks from working, so it does not behave as desired.
|
||||
|
||||
## Proposal
|
||||
|
||||
The proposal is to create a separate "chain" of less privileged service accounts that can only run `plan`, used only when a repository configuration sets a branch for merges in the `cicd_repositories` variable.
|
||||
|
||||
### Use cases
|
||||
|
||||
#### Merge branch set in repository configuration
|
||||
|
||||
```hcl
cicd_repositories = {
  bootstrap = {
    branch            = "main"
    identity_provider = "github-example"
    name              = "example/bootstrap"
    type              = "github"
  }
}
# tftest skip
```
|
||||
|
||||
When a merge branch is set as in the example above, the CI/CD workflow will have two separate flows:
|
||||
|
||||
- for PR checks, the OIDC token will be exchanged with credentials for the `plan`-only CI/CD service account, which can only impersonate the `plan`-only automation service account
|
||||
- for merges, the current flow that enables credential exchange and impersonation of the `apply`-enabled service account will be used
|
||||
|
||||
#### No merge branch set in repository configuration
|
||||
|
||||
```hcl
cicd_repositories = {
  bootstrap = {
    identity_provider = "github-example"
    name              = "example/bootstrap"
    type              = "github"
  }
}
# tftest skip
```
|
||||
|
||||
If no merge branch is set in the repository configuration as in the example above, the current behaviour will be preserved, allowing exchange and impersonation of the `apply`-enabled service account from any branch.
|
||||
|
||||
### Implementation
|
||||
|
||||
No changes to variables will be needed other than a lightweight refactor with `optional`.
|
||||
|
||||
The following resource changes will need to be implemented:
|
||||
|
||||
- define the set of read-only roles for each stage
|
||||
- create a new automation service account in each stage and assign the identified roles
|
||||
- create a new CI/CD service account with `roles/iam.serviceAccountTokenCreator` on the new automation service account
|
||||
- if a merge branch is set in the repository configuration
  - grant `roles/iam.workloadIdentityUser` on the new CI/CD service account to the `principalSet:` matching any branch
  - define a new provider file that impersonates the new automation service account and use it in the workflow for checks
  - keep the existing token exchange via `principal:`, impersonation and provider file for the `apply` part of the workflow only matching the specified merge branch
- if a branch is not set the current behaviour will be kept
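
The resource changes listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the FAST implementation: the variable names, account ids, and the `attribute.repository` principal set format (which depends on the attribute mapping configured in the workload identity pool provider) are all hypothetical.

```hcl
# Illustrative sketch only: names, variables and principal formats are
# assumptions, not the FAST implementation.

variable "automation_project_id" {
  description = "Project hosting the automation service accounts."
  type        = string
}

variable "wif_pool_name" {
  description = "Pool name, as projects/N/locations/global/workloadIdentityPools/ID."
  type        = string
}

variable "repository" {
  description = "Repository in OWNER/NAME format."
  type        = string
}

# Plan-only automation service account, to be granted read-only roles.
resource "google_service_account" "automation_plan" {
  project      = var.automation_project_id
  account_id   = "bootstrap-plan"
  display_name = "Plan-only automation service account."
}

# Plan-only CI/CD service account used by PR checks.
resource "google_service_account" "cicd_plan" {
  project      = var.automation_project_id
  account_id   = "bootstrap-cicd-plan"
  display_name = "Plan-only CI/CD service account."
}

# The CI/CD account may only impersonate the plan-only automation account.
resource "google_service_account_iam_member" "cicd_plan_impersonation" {
  service_account_id = google_service_account.automation_plan.name
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "serviceAccount:${google_service_account.cicd_plan.email}"
}

# Any branch of the repository can exchange its OIDC token for the
# plan-only CI/CD account.
resource "google_service_account_iam_member" "cicd_plan_wif" {
  service_account_id = google_service_account.cicd_plan.name
  role               = "roles/iam.workloadIdentityUser"
  member = format(
    "principalSet://iam.googleapis.com/%s/attribute.repository/%s",
    var.wif_pool_name, var.repository
  )
}
```

The key design point is that the broad `principalSet:` binding only ever reaches the plan-only chain, while the `apply`-enabled chain stays bound to the single `principal:` matching the merge branch.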
|
||||
|
||||
Implementation will modify the following in stages 0 and 1:
|
||||
|
||||
- the `automation.tf` files
|
||||
- any file where IAM roles are assigned to the automation service account
|
||||
- the `cicd-*.tf` files
|
||||
- the `templates/workflow-*.yaml` files to implement the new workflow logic
|
||||
- the `outputs.tf` files to generate the additional provider files
|
||||
|
||||
## Decision
|
||||
|
||||
This was surfaced a while ago, and implementation was only pending time for development. Development has started.
|
||||
|
||||
## Consequences
|
||||
|
||||
Existing CI/CD workflows will need to be replaced when a merge branch is already defined in the repository configuration (unlikely to happen as the current workflow would not work).
|
||||
142
adrs/fast/0-domainless-iam.md
Normal file
@@ -0,0 +1,142 @@
|
||||
# Support for domain-less organizations
|
||||
|
||||
**authors:** [Ludo](https://github.com/ludoo) \
|
||||
**reviewed by:** [Julio](https://github.com/juliocc) \
|
||||
**date:** Feb 12, 2024
|
||||
|
||||
## Status
|
||||
|
||||
Implemented in [#2064](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/pull/2064).
|
||||
|
||||
## Context
|
||||
|
||||
The current FAST design assumes that operational groups come from the same Cloud Identity instance connected to the GCP organization.
|
||||
|
||||
While this approach has worked well in the past, there are already designs that cannot be easily mapped (for example groups coming from a separate Cloud Identity instance), and the situation will only get worse once domain-less organizations start to be in wider use.
|
||||
|
||||
Removing the assumption that FAST logical principals (e.g. `gcp-organization-admins`) always map directly to groups is not entirely trivial, since FAST uses data from the `groups` variable in different places:
|
||||
|
||||
- to define authoritative IAM bindings via the module-level `group_iam` interface
|
||||
- to define additive IAM bindings via the module-level `iam_bindings_additive` interface
|
||||
- to set essential contacts at the folder and project level
|
||||
|
||||
This proposal removes the dependency on groups by allowing any principal type to be passed in to FAST, while still trying to preserve the current default behaviour and code readability in IAM bindings.
|
||||
|
||||
## Proposal
|
||||
|
||||
### FAST variable type change and optional interpolation
|
||||
|
||||
The current `groups` variable was meant as a simple mapping between logical profile names used internally by FAST, and actual group names. The default case was furthermore made easier by interpolating the organization domain when no domain was specified, and adding the `group:` principal prefix for IAM bindings.
|
||||
|
||||
The new proposed variable maintains the legacy behaviour, but slightly changes it so that no interpolation happens if the variable attributes have a principal prefix. The variable type is also updated to use `optional`, so that individual logical profile / principal mappings can be specified without having to override the whole block.
|
||||
|
||||
```hcl
variable "groups" {
  type = object({
    gcp-billing-admins      = optional(string, "gcp-billing-admins")
    gcp-devops              = optional(string, "gcp-devops")
    gcp-network-admins      = optional(string, "gcp-network-admins")
    gcp-organization-admins = optional(string, "gcp-organization-admins")
    gcp-security-admins     = optional(string, "gcp-security-admins")
    gcp-support             = optional(string, "gcp-support")
  })
  nullable = false
  default  = {}
}
```
|
||||
|
||||
Passing in different principals is intuitive:
|
||||
|
||||
```hcl
groups = {
  gcp-devops              = "principalSet://iam.googleapis.com/locations/global/workforcePools/mypool/group/abc123"
  gcp-organization-admins = "group:gcp-organization-admins@other.domain"
}
```
|
||||
|
||||
Internally, interpolation is fairly straightforward:
|
||||
|
||||
```hcl
locals {
  groups = {
    for k, v in var.group_principals : k => (
      can(regex("^[a-zA-Z]+:", v))
      ? v
      : "group:${v}@${var.organization.domain}"
    )
  }
}
```
|
||||
|
||||
### FAST IAM additive bindings and module interface change
|
||||
|
||||
FAST leverages the `group_iam` module-level interface to improve code readability for authoritative bindings, which is a primary goal of the framework. Introducing support for any principal type prevents us from using this interface, with a non-trivial impact on the overall readability of IAM roles in FAST.
|
||||
|
||||
This is an example use in the IaC project:
|
||||
|
||||
```hcl
# human (groups) IAM bindings
group_iam = {
  (local.groups.gcp-devops) = [
    "roles/iam.serviceAccountAdmin",
    "roles/iam.serviceAccountTokenCreator",
  ]
  (local.groups.gcp-organization-admins) = [
    "roles/iam.serviceAccountTokenCreator",
    "roles/iam.workloadIdentityPoolAdmin"
  ]
}
```
|
||||
|
||||
This proposal addresses the issue by changing the module-level interface to support different principal types. The original goal for `group_iam` -- to allow for better readability -- is preserved at the cost of a slight increase in verbosity due to having to specify the principal type.
|
||||
|
||||
The trade-off in verbosity seems acceptable as it makes the new interface more flexible, and allows using the interface for `principal:` and `principalSet:` types, which are becoming more and more important to support.
|
||||
|
||||
FAST code remains unchanged, as the `groups` local already contains a prefix for each principal, either interpolated or passed in by the user.
|
||||
|
||||
The module-level variable definition changes only its name and description:
|
||||
|
||||
```hcl
variable "iam_by_principals" {
  description = "Authoritative IAM binding in {PRINCIPAL => [ROLES]} format. Principals need to be statically defined to avoid cycle errors. Merged internally with the `iam` variable."
  type        = map(list(string))
  default     = {}
  nullable    = false
}
```
|
||||
|
||||
Actual use is basically unchanged from the current `group_iam` interface:
|
||||
|
||||
```hcl
# current interface
group_iam = {
  "app1-admins@example.org" = [
    "roles/owner",
    "roles/resourcemanager.folderAdmin",
    "roles/resourcemanager.projectCreator"
  ]
}
# proposed interface
iam_by_principals = {
  "group:app1-admins@example.org" = [
    "roles/owner",
    "roles/resourcemanager.folderAdmin",
    "roles/resourcemanager.projectCreator"
  ]
  "principalSet://iam.googleapis.com/locations/global/workforcePools/mypool/group/abc123" = [
    "roles/owner",
    "roles/resourcemanager.folderAdmin",
    "roles/resourcemanager.projectCreator"
  ]
}
```
|
||||
|
||||
### FAST essential contacts
|
||||
|
||||
Having `group_principals` support different types of principals will make it impossible to use the same variable to set essential contacts, as the principal might not be a group.
|
||||
|
||||
This will require introduction of a new `essential_contacts` top-level variable keyed by folder/project (the individual contexts on which to set contacts), with the added benefit of being able to specify different and potentially multiple contacts compared to now.
|
||||
|
||||
## Decision
|
||||
|
||||
Rolled out.
|
||||
54
adrs/fast/0-org-policies.md
Normal file
@@ -0,0 +1,54 @@
|
||||
# Move organization policies to bootstrap stage
|
||||
|
||||
**authors:** [Julio](https://github.com/juliocc), [Ludo](https://github.com/ludoo), [Roberto](https://github.com/drebes) \
|
||||
**date:** September 13, 2023
|
||||
|
||||
## Status
|
||||
|
||||
Implemented.
|
||||
|
||||
## Context
|
||||
|
||||
Three different requirements drive this proposal.
|
||||
|
||||
### Organization policies deployed at bootstrap time
|
||||
|
||||
Many organizations take security seriously, and would like to have organization policies (for example `iam.automaticIamGrantsForDefaultServiceAccounts`) deployed right from the beginning at bootstrap time. This is currently extremely cumbersome, as organization policies are managed in stage 1.
|
||||
|
||||
As an additional benefit, managing some or all organization policies in stage 0 will make it possible to turn off undesired resource configurations for the initial projects (for example `compute.skipDefaultNetworkCreation`).
|
||||
|
||||
### Simplify and limit delegation of Organization Policy Administrator role
|
||||
|
||||
Automation service accounts are currently assigned the Organization Policy Administrator role at the organization level, scoped via resource management tags. This is cumbersome as bindings are distributed between stage 0 that delegates role control to the stage 1 service account, and stage 1 that creates the automation service accounts, tags and folder bindings used for scoping.
|
||||
|
||||
A more secure way of doing this is via a dedicated resource management tag value hierarchy, and conditions on the organization policies that alter behaviour based on tags. This would allow centrally defining allowed exceptions to organization policies, and selectively granting access to specific exceptions to individual automation service accounts via tag values.
|
||||
|
||||
The project factory will need to retain scoped grants, to set policies that enforce lists of resources which would be too cumbersome to maintain in stage 0.
|
||||
|
||||
### Reduce stage 1 complexity to allow simpler creation of hierarchy templates
|
||||
|
||||
Stage 1 is currently too complex to allow easy cloning into different resource hierarchy templates, which are needed to account for all landing zone designs.
|
||||
|
||||
Removing complexity from stage 1 by moving organization policy and its related IAM to stage 0 will be an initial step towards stage 1 simplification.
|
||||
|
||||
## Proposal
|
||||
|
||||
The proposal is to:
|
||||
|
||||
- move management of organization policies to stage 0
|
||||
- move management of the `org_policies` tag key and associated values to stage 0
|
||||
- remove delegated/conditional grants for the Organization Policy Administrator role from stage 0 and 1
|
||||
|
||||
The approach fattens stage 0 and lessens its decoupling role in the overall FAST design, but looks preferable compared to the complexity of splitting organization policy management between stage 0 and 1, or worse delegating control of specific policies to external commands run before stage 0.
|
||||
|
||||
## Decision
|
||||
|
||||
Decision is to implement this.
|
||||
|
||||
## Consequences
|
||||
|
||||
Organization policies and related tags will need to be moved from stage 1 to stage 0 state. One approach is to:
|
||||
|
||||
- switch both states to local state
|
||||
- use `terraform state mv -state-out` to temporarily move resources from stage 1 to stage 0
|
||||
- push stage 0 and stage 1 state
|
||||
41
adrs/fast/1-network-ranges.md
Normal file
@@ -0,0 +1,41 @@
|
||||
# IP ranges for network stages
|
||||
|
||||
**authors:** [Ludo](https://github.com/ludoo), [Roberto](https://github.com/drebes), [Julio](https://github.com/jccb) \
|
||||
**date:** Sept 20, 2023
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Context
|
||||
|
||||
Adding subnets to networking stages, or changing existing ones, is an error-prone process because there is no clear IP plan. The problem was made worse when we began supporting GKE, which requires secondary ranges and a large number of IP addresses for pods and services.
|
||||
|
||||
This was not an issue when there were only a few networking stages, but as FAST expands, it becomes more difficult to keep track of IP ranges for different regions and environments.
|
||||
|
||||
## Decision
|
||||
|
||||
We adopted an IP plan based on regions and environments with the following key points:
|
||||
|
||||
- Large ranges for the 3 environments we have out of the box (landing, dev, prod)
|
||||
- Support for 2 regions
|
||||
- Leave enough space to easily grow either the number of environments or regions
|
||||
- Allocate large blocks from the CG-NAT range to use as secondary ranges, primarily for GKE pods and services.
|
||||
|
||||
The following table summarizes the agreed IP plan:
|
||||
|
||||
|                            | aggregate     | landing                                                            | dev           | prod          |
|----------------------------|--------------:|-------------------------------------------------------------------:|--------------:|--------------:|
| Region 1, primary ranges   | 10.64.0.0/12  | 10.64.0.0/16<br>Trusted: 10.64.0.0/17<br>Untrusted: 10.64.128.0/17 | 10.68.0.0/16  | 10.72.0.0/16  |
| Region 2, primary ranges   | 10.80.0.0/12  | 10.80.0.0/16<br>Trusted: 10.80.0.0/17<br>Untrusted: 10.80.128.0/17 | 10.84.0.0/16  | 10.88.0.0/16  |
| Region 1, secondary ranges | 100.64.0.0/12 | 100.64.0.0/14                                                      | 100.68.0.0/14 | 100.72.0.0/14 |
| Region 2, secondary ranges | 100.80.0.0/12 | 100.80.0.0/14                                                      | 100.84.0.0/16 | 100.88.0.0/14 |
|
||||
|
||||
To allocate additional secondary ranges for GKE clusters:
|
||||
|
||||
- For the pods range, use the next available /16 in the secondary range of its region/environment pair.
|
||||
- For the service range, use the next available /24 in the last /16 of its region/environment pair.
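
The two allocation rules can be expressed with Terraform's built-in `cidrsubnet` function. The sketch below uses the "Region 1, dev" secondary range from the table; which /16 counts as "next available" is an assumption made for the example, and the local names are hypothetical.

```hcl
# Sketch using the "Region 1, dev" secondary range from the table above.
locals {
  secondary_range = "100.68.0.0/14"

  # A /14 holds four /16 blocks, numbered 0 to 3 (2 additional prefix bits).
  # Pods: take the next free /16, here assumed to be block 1.
  pods_range = cidrsubnet(local.secondary_range, 2, 1) # 100.69.0.0/16

  # Services: /24 blocks are carved from the last /16 (block 3).
  last_block     = cidrsubnet(local.secondary_range, 2, 3) # 100.71.0.0/16
  services_range = cidrsubnet(local.last_block, 8, 0)      # 100.71.0.0/24
}

output "gke_ranges" {
  value = {
    pods     = local.pods_range
    services = local.services_range
  }
}
```

Each additional cluster would increment the pods block index and the services /24 index, so pods ranges grow upward from the start of the range while services ranges stay confined to its last /16.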
|
||||
|
||||
## Consequences
|
||||
|
||||
Default subnets for networking stages were updated to reflect the new ranges.
|
||||
3
adrs/fast/README.md
Normal file
@@ -0,0 +1,3 @@
|
||||
# FAST architectural documents
|
||||
|
||||
This folder contains assorted bits of documentation used to log current architectural choices, or past decisions. Format is inspired by [Michael Nygard's decision record template](https://github.com/joelparkerhenderson/architecture-decision-record/blob/main/templates/decision-record-template-by-michael-nygard/index.md).
|
||||
388
adrs/modules/20230816-iam-refactor.md
Normal file
@@ -0,0 +1,388 @@
|
||||
# Refactor IAM interface
|
||||
|
||||
**authors:** [Ludo](https://github.com/ludoo), [Julio](https://github.com/juliocc) \
|
||||
**last modified:** February 12, 2024
|
||||
|
||||
## Status
|
||||
|
||||
- Implemented in [#1595](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/pull/1595).
|
||||
- Authoritative bindings type changed as per [#1622](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/issues/1622).
|
||||
- Extended by [#2064](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/issues/2064).
|
||||
|
||||
## Context
|
||||
|
||||
The IAM interface in our modules has evolved organically to progressively support more functionality, resulting in a large variable surface, lack of support for some key features like conditions, and some fragility for specific use cases.
|
||||
|
||||
We currently support, with uneven coverage across modules:
|
||||
|
||||
- authoritative `iam` in `ROLE => [PRINCIPALS]` format
|
||||
- authoritative `group_iam` in `GROUP => [ROLES]` format
|
||||
- legacy additive `iam_additive` in `ROLE => [PRINCIPALS]` format which breaks for dynamic values
|
||||
- legacy additive `iam_additive_members` in `PRINCIPAL => [ROLES]` format which breaks for dynamic values
|
||||
- new additive `iam_members` in `KEY => {role: ROLE, member: MEMBER, condition: CONDITION}` format which works with dynamic values and supports conditions
|
||||
- policy authoritative `iam_policy`
|
||||
- specific support for third party resource bindings in the service account module
|
||||
|
||||
## Proposal
|
||||
|
||||
### Authoritative bindings
|
||||
|
||||
These tend to work well in practice, and the current `iam` and `group_iam` variables are simple to use with good coverage across modules.
|
||||
|
||||
The only small use case that they do not cover is IAM conditions, which are easy to implement but would render the interface more verbose for the majority of cases where conditions are not needed.
|
||||
|
||||
The **proposal** for authoritative bindings is to
|
||||
|
||||
- leave the current interface in place (`iam` and `group_iam`)
|
||||
- expand coverage so that all modules that have IAM resources expose both
|
||||
- add a new `iam_bindings` variable to support authoritative IAM with conditions
|
||||
|
||||
The new `iam_bindings` variable will look like this:
|
||||
|
||||
```hcl
variable "iam_bindings" {
  description = "Authoritative IAM bindings in {KEY => {role = ROLE, members = [], condition = {}}}. Keys are arbitrary."
  type = map(object({
    members = list(string)
    role    = string
    condition = optional(object({
      expression  = string
      title       = string
      description = optional(string)
    }))
  }))
  nullable = false
  default  = {}
}
```
|
||||
|
||||
This variable will not be internally merged in modules with `iam` or `group_iam`.
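
A usage sketch may help show the intended shape. The module source path, project name, group and condition below are illustrative assumptions; only the variable structure comes from the definition above.

```hcl
# Hypothetical module call; the source path, names and condition are
# illustrative assumptions.
module "project" {
  source = "./modules/project"
  name   = "my-project"
  iam_bindings = {
    # Keys are arbitrary and only identify the binding.
    storage-admin-temporary = {
      role    = "roles/storage.admin"
      members = ["group:data-admins@example.org"]
      condition = {
        title       = "expires_end_of_2025"
        description = "Temporary access."
        expression  = "request.time < timestamp(\"2026-01-01T00:00:00Z\")"
      }
    }
  }
}
```

Because the map is keyed by an arbitrary string rather than by role, the same role can appear in several bindings with different conditions, which the role-keyed `iam` variable cannot express.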
|
||||
|
||||
### Additive bindings
|
||||
|
||||
Additive bindings have evolved to mimic authoritative ones, but the result is an interface which is bloated (no one uses `iam_additive_members`), and hard to understand and use without triggering dynamic errors. Coverage is also spotty and uneven across modules, and the interface needs to support aliasing of project service accounts in the project module to work around dynamic errors.
|
||||
|
||||
The `iam_additive` variable is used in a special pattern in data blueprints, so that code does not disturb existing IAM bindings in an external project on destroy. This pattern only works in a limited set of cases, where principals are passed in via static variables or refer to "magic" static outputs in our modules. This is a simple example of the pattern:
|
||||
|
||||
```hcl
locals {
  iam = {
    "roles/viewer" = [
      module.sa.iam_email,
      var.group.admins
    ]
  }
}
module "project" {
  iam = (
    var.project_create == null ? {} : local.iam
  )
  iam_additive = (
    var.project_create != null ? {} : local.iam
  )
}
```
|
||||
|
||||
The **proposal** for additive bindings is to
|
||||
|
||||
- remove `iam_additive` and `iam_additive_members` from the interface
|
||||
- add a new `iam_bindings_additive` variable
|
||||
|
||||
Once new variables are in place, migrate existing blueprints to using `iam_bindings_additive` using one of the two available patterns:
|
||||
|
||||
- the flat verbose one where bindings are declared in the module call
|
||||
- the more complex one that moves roles out to `locals` and uses them in `for` loops
|
||||
|
||||
The new variable will closely follow the type of the authoritative `iam_bindings` variable described above:
|
||||
|
||||
```hcl
variable "iam_bindings_additive" {
  description = "Additive IAM bindings with support for conditions, in {KEY => { role = ROLE, member = MEMBER, condition = {}}} format."
  type = map(object({
    member = string
    role   = string
    condition = optional(object({
      expression  = string
      title       = string
      description = optional(string)
    }))
  }))
}
```
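
The following sketch shows how the static keys are meant to be used with a value that is only known at apply time. The module source path, the key, and the `service_account_email` variable are hypothetical and exist only for illustration.

```hcl
# Hypothetical input standing in for a value computed elsewhere.
variable "service_account_email" {
  description = "Email of a service account created outside this module."
  type        = string
}

# Hypothetical module call; source path and names are assumptions.
module "project" {
  source = "./modules/project"
  name   = "my-project"
  iam_bindings_additive = {
    # The key is static, so it is always known at plan time.
    viewer-sa = {
      role = "roles/viewer"
      # The member may be a dynamic value without affecting the key.
      member = "serviceAccount:${var.service_account_email}"
    }
  }
}
```

Terraform requires `for_each` keys to be known at plan time, so keeping the dynamic part in the value rather than the key is what avoids the errors seen with the legacy additive variables.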

### IAM policy

The **proposal** is to remove the IAM policy variable and resources, as its coverage is very uneven and we never used it in practice. This will also simplify data access log management, which is currently split between its own variable/resource and the IAM policy ones.

### IAM by Principals

> [!NOTE]
> This section was added on 2024-02-12

[#2064](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/issues/2064) replaced `group_iam` with `iam_by_principals`. The structure of `iam_by_principals` is similar to the original `group_iam`, with the difference that the user now has to specify the principal type with the correct prefix. The new variable format is shown below:

```hcl
variable "iam_by_principals" {
  description = "Authoritative IAM binding in {PRINCIPAL => [ROLES]} format. Principals need to be statically defined to avoid cycle errors. Merged internally with the `iam` variable."
  type        = map(list(string))
  default     = {}
  nullable    = false
}
```
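
A minimal usage sketch, assuming a module that exposes both `iam` and `iam_by_principals`. The module name, `source` path and all principals are placeholders invented for this example. Note the explicit `group:` and `user:` prefixes on the keys.

```hcl
module "project" {
  source = "./modules/project" # assumed path
  # other module arguments omitted
  iam = {
    "roles/viewer" = ["serviceAccount:app@my-project.iam.gserviceaccount.com"]
  }
  iam_by_principals = {
    "group:admins@example.com" = ["roles/owner", "roles/viewer"]
    "user:jane@example.com"    = ["roles/viewer"]
  }
}

# Authoritative bindings that should result from the internal merge
# (member order is not significant):
#   "roles/owner"  = ["group:admins@example.com"]
#   "roles/viewer" = [
#     "serviceAccount:app@my-project.iam.gserviceaccount.com",
#     "group:admins@example.com",
#     "user:jane@example.com",
#   ]
```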

See #2064 and [this ADR](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/ludo/iam-changes/fast/docs/0-domainless-iam.md) for more details.

## Decision

The proposal above summarizes the state of discussions between the authors, and implementation will be tested.

## Consequences

### FAST

IAM implementation in the bootstrap stage and the matching multitenant bootstrap has radically changed, with the addition of a new [`organization-iam.tf`](https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/master/fast/stages/0-bootstrap/organization-iam.tf) file. This file contains IAM binding definitions in an abstracted format, which is then converted to the specific formats required by the `iam`, `iam_bindings` and `iam_bindings_additive` variables.

This brings several advantages over the previous handling of IAM:

- authoritative and additive bindings are now grouped by principal in an easy-to-read and easy-to-change format that serves as its own documentation
- support for IAM conditions has removed the need for standalone resources and made the intent behind those more explicit
- some subtle bugs on the intersection of user-specified bindings and internally-specified ones have been addressed
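
The conversion from a principal-grouped format to the module variables could look roughly like the sketch below. This is not the actual content of `organization-iam.tf`: the `iam_principal_bindings` local, its `authoritative`/`additive` attributes, the principals and the roles are all assumptions chosen to illustrate the idea.

```hcl
locals {
  # hypothetical abstracted format: bindings grouped by principal
  iam_principal_bindings = {
    "group:gcp-organization-admins@example.com" = {
      authoritative = ["roles/resourcemanager.organizationAdmin"]
      additive      = ["roles/orgpolicy.policyAdmin"]
    }
    "group:gcp-security-admins@example.com" = {
      authoritative = ["roles/logging.admin"]
      additive      = []
    }
  }
  # all roles that appear in at least one authoritative list
  _authoritative_roles = distinct(flatten([
    for principal, bindings in local.iam_principal_bindings :
    bindings.authoritative
  ]))
  # invert to the {ROLE => [MEMBERS]} format expected by `iam`
  iam = {
    for role in local._authoritative_roles : role => [
      for principal, bindings in local.iam_principal_bindings :
      principal if contains(bindings.authoritative, role)
    ]
  }
  # expand to the {KEY => {member, role}} format expected by `iam_bindings_additive`
  iam_bindings_additive = merge([
    for principal, bindings in local.iam_principal_bindings : {
      for role in bindings.additive :
      "${principal}-${role}" => { member = principal, role = role }
    }
  ]...)
}
```

The resulting `local.iam` and `local.iam_bindings_additive` could then be passed to the organization module through the variables of the same name.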

### Blueprints

A few blueprints that leveraged `iam_additive` have been refactored to use the new variable. This is most notable in data blueprints, where extra files have been added to the more complex examples like data foundations, to abstract IAM bindings in a way similar to what is described above for FAST.

## Implementation

The following sections provide a template for IAM-related variables and resources to ensure a consistent implementation of IAM across the repository. Use these code snippets to add IAM support to your module.

### Top-level module IAM

Use this template if your module manages a single instance of a given resource (e.g. a KMS keyring).

```terraform
# variables.tf

variable "iam" {
  description = "IAM bindings in {ROLE => [MEMBERS]} format. Mutually exclusive with the access_* variables used for basic roles."
  type        = map(list(string))
  default     = {}
  nullable    = false
}

variable "iam_bindings" {
  description = "Authoritative IAM bindings in {KEY => {role = ROLE, members = [], condition = {}}}. Keys are arbitrary."
  type = map(object({
    members = list(string)
    role    = string
    condition = optional(object({
      expression  = string
      title       = string
      description = optional(string)
    }))
  }))
  default  = {}
  nullable = false
}

variable "iam_bindings_additive" {
  description = "Keyring individual additive IAM bindings. Keys are arbitrary."
  type = map(object({
    member = string
    role   = string
    condition = optional(object({
      expression  = string
      title       = string
      description = optional(string)
    }))
  }))
  default  = {}
  nullable = false
}

variable "iam_by_principals" {
  description = "Authoritative IAM binding in {PRINCIPAL => [ROLES]} format. Principals need to be statically defined to avoid cycle errors. Merged internally with the `iam` variable."
  type        = map(list(string))
  default     = {}
  nullable    = false
}
```

```terraform
# iam.tf

locals {
  _iam_principal_roles = distinct(flatten(values(var.iam_by_principals)))
  _iam_principals = {
    for r in local._iam_principal_roles : r => [
      for k, v in var.iam_by_principals :
      k if try(index(v, r), null) != null
    ]
  }
  iam = {
    for role in distinct(concat(keys(var.iam), keys(local._iam_principals))) :
    role => concat(
      try(var.iam[role], []),
      try(local._iam_principals[role], [])
    )
  }
}

resource "google_RESOURCE_TYPE_iam_binding" "authoritative" {
  for_each = local.iam
  role     = each.key
  members  = each.value
  # add extra attributes (e.g. resource id)
}

resource "google_RESOURCE_TYPE_iam_binding" "bindings" {
  for_each = var.iam_bindings
  role     = each.value.role
  members  = each.value.members
  # add extra attributes (e.g. resource id)

  dynamic "condition" {
    for_each = each.value.condition == null ? [] : [""]
    content {
      expression  = each.value.condition.expression
      title       = each.value.condition.title
      description = each.value.condition.description
    }
  }
}

resource "google_RESOURCE_TYPE_iam_member" "bindings" {
  for_each = var.iam_bindings_additive
  role     = each.value.role
  member   = each.value.member
  # add extra attributes (e.g. resource id)

  dynamic "condition" {
    for_each = each.value.condition == null ? [] : [""]
    content {
      expression  = each.value.condition.expression
      title       = each.value.condition.title
      description = each.value.condition.description
    }
  }
}
```
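
A module built from this template might be called as in the sketch below. The module name, `source` path, principals, binding keys and the condition are illustrative assumptions; only the variable shapes come from the template.

```hcl
module "kms" {
  source = "./modules/kms" # assumed path
  # other module arguments omitted

  # authoritative, {ROLE => [MEMBERS]}
  iam = {
    "roles/cloudkms.admin" = ["group:kms-admins@example.com"]
  }

  # authoritative, with an optional IAM condition
  iam_bindings = {
    app-encrypt-temp = {
      role    = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
      members = ["serviceAccount:app@my-project.iam.gserviceaccount.com"]
      condition = {
        title      = "expires-2025"
        expression = "request.time < timestamp(\"2025-01-01T00:00:00Z\")"
      }
    }
  }

  # additive, one member per binding
  iam_bindings_additive = {
    auditor-viewer = {
      member = "user:auditor@example.com"
      role   = "roles/cloudkms.viewer"
    }
  }
}
```

Since both `iam` and `iam_bindings` create authoritative bindings, using the same role in `iam` and in an unconditioned `iam_bindings` entry would likely make the two resources fight over that role's membership.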

### Sub-resources IAM

Use this template if your module manages multiple instances of a resource (e.g. keys in a KMS keyring).

```terraform
# variables.tf

# replace SUB_RESOURCE with the sub-resource name (e.g. key)
variable "SUB_RESOURCEs" {
  type = map(object({
    # sub-resource configuration here

    iam = optional(map(list(string)), {})
    iam_bindings = optional(map(object({
      members = list(string)
      role    = string
      condition = optional(object({
        expression  = string
        title       = string
        description = optional(string)
      }))
    })), {})
    iam_bindings_additive = optional(map(object({
      member = string
      role   = string
      condition = optional(object({
        expression  = string
        title       = string
        description = optional(string)
      }))
    })), {})
  }))
  default  = {}
  nullable = false
}
```

```terraform
# iam.tf

locals {
  SUB_RESOURCE_iam = flatten([
    for k, v in var.SUB_RESOURCEs : [
      for role, members in v.iam : {
        SUB_RESOURCE = k
        role         = role
        members      = members
      }
    ]
  ])
  # keys are prefixed with the sub-resource key, so that identical
  # binding keys in different sub-resources do not overwrite each other
  SUB_RESOURCE_iam_bindings = merge([
    for k, v in var.SUB_RESOURCEs : {
      for binding_key, data in v.iam_bindings :
      "${k}.${binding_key}" => {
        SUB_RESOURCE = k
        role         = data.role
        members      = data.members
        condition    = data.condition
      }
    }
  ]...)
  SUB_RESOURCE_iam_bindings_additive = merge([
    for k, v in var.SUB_RESOURCEs : {
      for binding_key, data in v.iam_bindings_additive :
      "${k}.${binding_key}" => {
        SUB_RESOURCE = k
        role         = data.role
        member       = data.member
        condition    = data.condition
      }
    }
  ]...)
}
```

```terraform
# iam.tf

resource "google_SUB_RESOURCE_iam_binding" "authoritative" {
  for_each = {
    for binding in local.SUB_RESOURCE_iam :
    "${binding.SUB_RESOURCE}.${binding.role}" => binding
  }
  role    = each.value.role
  members = each.value.members
  # add extra attributes (e.g. sub resource id)
}

resource "google_SUB_RESOURCE_iam_binding" "bindings" {
  for_each = local.SUB_RESOURCE_iam_bindings
  role     = each.value.role
  members  = each.value.members
  # add extra attributes (e.g. sub resource id)

  dynamic "condition" {
    for_each = each.value.condition == null ? [] : [""]
    content {
      expression  = each.value.condition.expression
      title       = each.value.condition.title
      description = each.value.condition.description
    }
  }
}

resource "google_SUB_RESOURCE_iam_member" "members" {
  for_each = local.SUB_RESOURCE_iam_bindings_additive
  role     = each.value.role
  member   = each.value.member
  # add extra attributes (e.g. sub resource id)

  dynamic "condition" {
    for_each = each.value.condition == null ? [] : [""]
    content {
      expression  = each.value.condition.expression
      title       = each.value.condition.title
      description = each.value.condition.description
    }
  }
}
```
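
For illustration, a caller of a module following this template might pass per-sub-resource IAM as sketched below, here assuming the sub-resource is a KMS key and the variable is therefore named `keys`. The module name, `source` path, key names, principals and binding keys are placeholders.

```hcl
module "kms" {
  source = "./modules/kms" # assumed path
  # other module arguments omitted
  keys = {
    key-a = {
      # authoritative bindings on this key only
      iam = {
        "roles/cloudkms.cryptoKeyEncrypter" = [
          "serviceAccount:writer@my-project.iam.gserviceaccount.com"
        ]
      }
      # additive binding on this key only
      iam_bindings_additive = {
        reader-decrypt = {
          member = "serviceAccount:reader@my-project.iam.gserviceaccount.com"
          role   = "roles/cloudkms.cryptoKeyDecrypter"
        }
      }
    }
    # no IAM attributes set: all three default to empty maps
    key-b = {}
  }
}
```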
102
adrs/modules/20231106-factories.md
Normal file
@@ -0,0 +1,102 @@
# Factories Refactor and Plan Forward

**authors:** [Ludo](https://github.com/ludoo) \
**last modified:** February 16, 2024

## Status

Under discussion.

## Context

Factories evolved progressively in Fabric, from the original firewall factory module to a semi-standardized approach to the management of repeated resources. This progression happened piecemeal, and it's now time to define a clear strategy for factories in both Fabric and FAST, so that we can remove guesswork from new developments and provide a predictable approach to users.

The remainder of this section provides a summary of the current status.

### Modules

Several modules implement factories for repeated resources which are typically dependent on the main resource managed in the module:

- `billing-account` provides a factory for billing alert rules tied to the billing account
- `dns-response-policy` provides a factory for rules within the policy
- `net-firewall-policy` provides a factory for rules within the policy
- `net-vpc` provides a factory for subnets in the VPC
- `net-vpc-firewall` provides a factory for VPC firewall rules
- `organization` and `folder` provide a factory for hierarchical firewall rules within their policy
- `organization`, `folder` and `project` provide a factory for organization policies

The common pattern for modules is management of *multiple resources* typically dependent on the single *main resource* managed by the module.

### Blueprints

The `factories` folder in blueprints contains a collection of factories with a fuzzier approach:

- `bigquery-factory` manages tables and views for 1-n datasets by wrapping the `bigquery-dataset` module via simple locals
- `cloud-identity-group-factory` manages Cloud Identity group members for 1-n groups by wrapping the `cloud-identity-group` module via simple locals
- `net-vpc-firewall-yaml` is the original factory module managing VPC firewall rules, superseded by the factory in the `net-vpc-firewall` module
- `project-factory` combines the project, service account, and (planned) billing account and VPC modules to implement end-to-end project creation and configuration

There's no clear common pattern for these factories: some could be moved to the respective module, while the project factory combines a collection of modules to implement a process.

### FAST

FAST currently leverages module-level factories (organization policies, subnets, firewalls, etc.), and also provides the project factory as a dedicated level 3 stage by wrapping the relevant blueprint and localizing a few variables for the environment (`prefix`, `labels`).

## Proposal

While the current approach is reasonably clear with regard to modules, it has never been formalized in a set of guidelines that can help authors define when and how new factories would make sense.

On top of this, the `factories` blueprints folder contains code that should really be moved to module-level factories, and the project factory, which could/should be published directly as a FAST stage, since FAST stages are consumable as standalone modules.

This proposal aims to address the above problems.

### Module-level factory approach

The current approach for module-level factories can be summarized in a single principle:

> factories implemented in modules manage multiple resources which depend on one single main resource (or a small set of main resources) which is the main driver of the module.

For example, the module managing a firewall policy exposes a factory for its rules, or the module managing a VPC exposes a factory for its subnets. But the project module would not expose a projects factory, as one project maps to a single module invocation.

The proposal on factory modules then is to:

- align all factory variables to the same standard, outlined below
- move the groups and bigquery factories from blueprints to the respective modules
- eventually add more factories when it makes sense to do so (e.g. for KMS keys, service accounts, etc.)

The variable interface for module-level factories should use a single top-level `factory_configs` variable, whose type is an object with one or more attributes named according to the specific factory. This avoids name overlaps and so allows multiple factory configurations to be composed into a single variable in FAST stages. An example:

```hcl
variable "factory_configs" {
  description = "Path to folder containing budget alerts data files."
  type = object({
    budgets_data_path = optional(string, "data/billing-budgets")
  })
  nullable = false
  default  = {}
}
```
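
To illustrate how per-factory attribute names might allow composition in a stage, the sketch below declares one stage-level `factory_configs` variable and hands each module only its own attribute. The `org_policies_data_path` attribute, its default, the module `source` paths and the assumption that both modules accept a `factory_configs` argument are made up for this example.

```hcl
variable "factory_configs" {
  description = "Paths to the folders containing factory data files."
  type = object({
    budgets_data_path      = optional(string, "data/billing-budgets")
    org_policies_data_path = optional(string, "data/org-policies") # hypothetical
  })
  nullable = false
  default  = {}
}

module "billing-account" {
  source = "../../modules/billing-account" # assumed path
  # other module arguments omitted
  factory_configs = {
    budgets_data_path = var.factory_configs.budgets_data_path
  }
}

module "organization" {
  source = "../../modules/organization" # assumed path
  # other module arguments omitted
  factory_configs = {
    org_policies_data_path = var.factory_configs.org_policies_data_path
  }
}
```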

### Blueprint factories

The `factories` folder in blueprints will be emptied, and a single README left in it pointing to all the module-level and FAST stage factories available.

As outlined above, the existing factories will be moved to modules (bigquery and groups), FAST (project factory), or deleted (firewall rules).

### FAST factories

The only change for FAST factories will be moving the project factory from blueprints to the stage folder, and updating the path used for the environment-level wrapping stage.

### File schema and filesystem organization

Factory file schemas must mimic and implement the variable interface for the module, including optionals and validation, which are implemented in code and checks.

With notable exceptions (currently only the `cidrs.yaml` file consumed by firewall factories), the following convention for files and directories is proposed:

- Factories should consume directories (vs single files)
- All files should contain a dictionary of resources or a single resource
- If the factory accepts one resource per file (e.g. VPC subnets), the file name should be used for the resource name and the YAML should allow defining a `name:` override
- Files in a directory should be parsed together and flattened into a single dictionary

This allows developers to implement multiple resources in a single file or to use one file per resource, as they see fit.
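
A minimal sketch of how a module might implement these conventions with built-in Terraform functions, assuming YAML data files and reusing the `budgets_data_path` attribute from the earlier example. The local names are arbitrary, and the second half applies the one-resource-per-file convention to the same directory only for brevity.

```hcl
locals {
  _factory_path  = pathexpand(var.factory_configs.budgets_data_path)
  _factory_files = fileset(local._factory_path, "*.yaml")

  # Dictionary-per-file convention: each file holds a map of resources,
  # and all files are flattened into a single dictionary.
  factory_resources = merge([
    for f in local._factory_files :
    yamldecode(file("${local._factory_path}/${f}"))
  ]...)

  # One-resource-per-file convention: the file name (minus extension) is
  # the default resource name, unless the file defines a `name:` override.
  _factory_single = {
    for f in local._factory_files :
    trimsuffix(f, ".yaml") => yamldecode(file("${local._factory_path}/${f}"))
  }
  factory_resources_by_name = {
    for k, v in local._factory_single :
    try(v.name, k) => v
  }
}
```

With `merge`, a key defined in more than one file is silently taken from the last file processed, so a real implementation may want an explicit check for duplicate keys.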
3
adrs/modules/README.md
Normal file
@@ -0,0 +1,3 @@
# Fabric modules architectural documents

This folder contains assorted bits of documentation used to log current architectural choices, or past decisions. Format is inspired by [Michael Nygard's decision record template](https://github.com/joelparkerhenderson/architecture-decision-record/blob/main/templates/decision-record-template-by-michael-nygard/index.md).