Add Data Platform to FAST (#510)

* Import Fast from dev repository.
Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* Import Fast from dev repository.
Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* merge tools changes

* Import Fast from dev repository.
Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* add boilerplate to validate_schema

Co-authored-by: Julio Castillo <juliocc@users.noreply.github.com>

* stage 02-security

* Import Fast from dev repository.

Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* Copy FAST top level README

* Copy FAST top level README

* TODO list

* TODO list

* fix linting action to account for fast

* remove providers file

* add missing boilerplate

* update factory README

* align examples tfdoc

* fast readmes tfdoc

* disable markdown link check

* really disable markdown link check

* update TODO

* switch to local module refs in stage0

* replace module refs in 02-sec

* Import Fast from dev repository.
Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* merge tools changes

* Import Fast from dev repository.
Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* add boilerplate to validate_schema

Co-authored-by: Julio Castillo <juliocc@users.noreply.github.com>

* Import Fast from dev repository.
Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* stage 02-security

* Import Fast from dev repository.

Co-authored-by: Julio Castillo <jccb@google.com>
Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>

* Copy FAST top level README

* Copy FAST top level README

* TODO list

* TODO list

* fix linting action to account for fast

* remove providers file

* add missing boilerplate

* update factory README

* align examples tfdoc

* fast readmes tfdoc

* disable markdown link check

* really disable markdown link check

* update TODO

* switch to local module refs in stage0

* replace module refs in 02-sec

* Move first draft to fast branch

* Fix roles and variables. Add e2e DAG example!

* Fix example

* Fix KMS

* First draft: README

* Update README

* Add DLP, update README

* Update Readme

* README

* Add todos

* Merge master

* Merge master

* Merge master

* Fix and test KMS, Fix and test existing prj (it works also with single prj), Update README

* Fix README and Demo

* add  on TF files

* Remove block comments

* simplify service_encryption_keys logic

* fix README

* Fix TODOs

* fix tfdoc description

* fix demo README

* fix sample files

* rename tf files

* Fix outputs file name, fix README, remove dependencies on composer resource

* Add test.

* Fix README.

* Initial README update

* README review

* Fix issues & readme

* Fix README

* Fix README

* Fix test error

* Fix test error

* Add datacatalog

* Fix test, for real? :-)

* fix readme

* support policy_boolean

* split Cloud NAT flag

* Fix README.

* Fix Shared VPC, first try :-)

* Fix tests and resource name

* fix tests

* fix tests

* README refactor

* Fix secondary range logic

* First commit

* Replace existing data platform

* Fix secondary range logic

* Fix README

* Replace DP example tests with the new one.

* Fix test module location.

* Fix test module location, for real.

* Support DataPlatform project in VPC-SC

* Fix VPC-SC

* Add TODO, VPC-SC

* Possible improvement to handle VPC-SC perimeter projects with folder as variable

* Add TODO

* Fix module path

* Initial fix for KMS

* Add PubSub encryption

* Fix secondary range logic

* First commit

* Support DataPlatform project in VPC-SC

* Fix VPC-SC

* Add TODO, VPC-SC

* Possible improvement to handle VPC-SC perimeter projects with folder as variable

* Add TODO

* Fix module path

* Initial fix for KMS

* Update READMEs

* Update README

* Fix composer roles and README.

* Fix test.

* Fixes.

* Add DLP documentation link.

* Temp commit with errors

* Refactor variables

* Fix secondary range logic

* First commit

* Support DataPlatform project in VPC-SC

* Fix VPC-SC

* Add TODO, VPC-SC

* Possible improvement to handle VPC-SC perimeter projects with folder as variable

* Add TODO

* Fix module path

* Initial fix for KMS

* rebase

* rebase

* rebase

* Rebase

* rebase

* Update READMEs

* Fixes.

* Fix new variables

* Fix misconfiguration and tests.

* Fix secondary range logic

* First commit

* Support DataPlatform project in VPC-SC

* Fix VPC-SC

* Add TODO, VPC-SC

* Possible improvement to handle VPC-SC perimeter projects with folder as variable

* Add TODO

* Fix module path

* Initial fix for KMS

* rebase

* rebase

* rebase

* Rebase

* rebase

* Update READMEs

* Fixes.

* Rebase - Fix secondary range logic

* Rebase - First commit

* Support DataPlatform project in VPC-SC

* Fix VPC-SC

* Possible improvement to handle VPC-SC perimeter projects with folder as variable

* Initial fix for KMS

* Fix secondary range logic

* First commit

* Support DataPlatform project in VPC-SC

* Fix VPC-SC

* Fix module path

* Initial fix for KMS

* Update READMEs

* Fixes.

* Fix new variables

* Revert VPC-SC logic

* Fix variable typos

* README fixes

* Fix Project Name logic

* Fix Linting

* README

* update README

* update README

* update README

* mandatory project creation, refactor

* formatting

* add TODO for service accounts descriptive name

* use project module to assign shared vpc roles

* Fix shared-vpc-project module

* Fix vpc name and tests

* README

* update to newer version

Co-authored-by: Ludovico Magnocavallo <ludomagno@google.com>
Co-authored-by: Simone Ruffilli <sruffilli@google.com>
Co-authored-by: Julio Castillo <juliocc@users.noreply.github.com>
Co-authored-by: Julio Castillo <jccb@google.com>
This commit is contained in:
lcaggio
2022-02-11 17:32:16 +01:00
committed by GitHub
parent 9076c2f2b0
commit bf64a3dfda
73 changed files with 2732 additions and 1288 deletions

View File

@@ -195,6 +195,7 @@ resource "google_organization_iam_binding" "org_admin_delegated" {
"roles/compute.orgFirewallPolicyAdmin",
"roles/compute.xpnAdmin",
"roles/orgpolicy.policyAdmin",
module.organization.custom_role_id.serviceProjectNetworkAdmin
],
local.billing_org ? [
"roles/billing.admin",

View File

@@ -67,6 +67,16 @@ locals {
billing_account_id = var.billing_account.id
prefix = var.prefix
})
"03-data-platform-dev" = jsonencode({
billing_account_id = var.billing_account.id
organization = var.organization
prefix = var.prefix
})
"03-data-platform-prod" = jsonencode({
billing_account_id = var.billing_account.id
organization = var.organization
prefix = var.prefix
})
}
}

View File

@@ -20,6 +20,8 @@ locals {
# used here for convenience, in organization.tf members are explicit
billing_ext_users = concat(
[
module.branch-dp-dev-sa.iam_email,
module.branch-dp-prod-sa.iam_email,
module.branch-network-sa.iam_email,
module.branch-security-sa.iam_email,
],

View File

@@ -0,0 +1,137 @@
/**
* Copyright 2022 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
# tfdoc:file:description Data Platform stages resources.
# top-level Data Platform folder and service account
module "branch-dp-folder" {
source = "../../../modules/folder"
parent = "organizations/${var.organization.id}"
name = "Dataplatform"
}
#TODO check if I can delete those modules, Would you create a data-platform TF to run dev/prod?
# module "branch-dp-sa" {
# source = "../../../modules/iam-service-account"
# project_id = var.automation_project_id
# name = "resman-dp-0"
# description = "Terraform Data Platform production service account."
# prefix = local.prefixes.prod
# }
# module "branch-dp-gcs" {
# source = "../../../modules/gcs"
# project_id = var.automation_project_id
# name = "dp-0"
# prefix = local.prefixes.prod
# versioning = true
# iam = {
# "roles/storage.objectAdmin" = [module.branch-dp-sa.iam_email]
# }
# }
# environment: development folder
module "branch-dp-dev-folder" {
source = "../../../modules/folder"
parent = module.branch-dp-folder.id
# naming: environment descriptive name
name = "Data Platform - Development"
# environment-wide human permissions on the whole Data Platform environment
group_iam = {}
iam = {
# remove owner here and at project level if SA does not manage project resources
"roles/owner" = [
module.branch-dp-dev-sa.iam_email
]
"roles/logging.admin" = [
module.branch-dp-dev-sa.iam_email
]
"roles/resourcemanager.folderAdmin" = [
module.branch-dp-dev-sa.iam_email
]
"roles/resourcemanager.projectCreator" = [
module.branch-dp-dev-sa.iam_email
]
}
}
module "branch-dp-dev-sa" {
source = "../../../modules/iam-service-account"
project_id = var.automation_project_id
name = "resman-dp-dev-0"
# naming: environment in description
description = "Terraform Data Platform development service account."
prefix = local.prefixes.dev
}
module "branch-dp-dev-gcs" {
source = "../../../modules/gcs"
project_id = var.automation_project_id
name = "resman-dp-0"
prefix = local.prefixes.dev
versioning = true
iam = {
"roles/storage.objectAdmin" = [module.branch-dp-dev-sa.iam_email]
}
}
# environment: production folder
module "branch-dp-prod-folder" {
source = "../../../modules/folder"
parent = module.branch-dp-folder.id
# naming: environment descriptive name
name = "Data Platform - Production"
# environment-wide human permissions on the whole Data Platform environment
group_iam = {}
iam = {
# remove owner here and at project level if SA does not manage project resources
"roles/owner" = [
module.branch-dp-prod-sa.iam_email
]
"roles/logging.admin" = [
module.branch-dp-prod-sa.iam_email
]
"roles/resourcemanager.folderAdmin" = [
module.branch-dp-prod-sa.iam_email
]
"roles/resourcemanager.projectCreator" = [
module.branch-dp-prod-sa.iam_email
]
}
}
module "branch-dp-prod-sa" {
source = "../../../modules/iam-service-account"
project_id = var.automation_project_id
name = "resman-dp-0"
# naming: environment in description
description = "Terraform Data Platform production service account."
prefix = local.prefixes.prod
}
module "branch-dp-prod-gcs" {
source = "../../../modules/gcs"
project_id = var.automation_project_id
name = "resman-dp-0"
prefix = local.prefixes.prod
versioning = true
iam = {
"roles/storage.objectAdmin" = [module.branch-dp-prod-sa.iam_email]
}
}

View File

@@ -18,6 +18,11 @@
locals {
# set to the empty list if you remove the data platform branch
branch_dataplatform_pf_sa_iam_emails = [
module.branch-dp-dev-sa.iam_email,
module.branch-dp-prod-sa.iam_email
]
# set to the empty list if you remove the teams branch
branch_teams_pf_sa_iam_emails = [
module.branch-teams-dev-projectfactory-sa.iam_email,
@@ -58,7 +63,10 @@ module "organization" {
"roles/compute.xpnAdmin" = [
module.branch-network-sa.iam_email
]
"roles/orgpolicy.policyAdmin" = local.branch_teams_pf_sa_iam_emails
"roles/orgpolicy.policyAdmin" = concat(
local.branch_dataplatform_pf_sa_iam_emails,
local.branch_teams_pf_sa_iam_emails
)
},
local.billing_org ? {
"roles/billing.costsManager" = local.branch_teams_pf_sa_iam_emails
@@ -71,6 +79,7 @@ module "organization" {
# [
# for k, v in module.branch-teams-team-sa : v.iam_email
# ],
local.branch_dataplatform_pf_sa_iam_emails,
local.branch_teams_pf_sa_iam_emails
)
} : {}

View File

@@ -15,6 +15,10 @@
*/
locals {
_data_platform_sas = {
dev = module.branch-dp-dev-sa.iam_email
prod = module.branch-dp-prod-sa.iam_email
}
_project_factory_sas = {
dev = module.branch-teams-dev-projectfactory-sa.iam_email
prod = module.branch-teams-prod-projectfactory-sa.iam_email
@@ -30,6 +34,16 @@ locals {
name = "security"
sa = module.branch-security-sa.email
})
"03-data-platform-dev" = templatefile("${path.module}/../../assets/templates/providers.tpl", {
bucket = module.branch-dp-dev-gcs.name
name = "dp-dev"
sa = module.branch-dp-dev-sa.email
})
"03-data-platform-prod" = templatefile("${path.module}/../../assets/templates/providers.tpl", {
bucket = module.branch-dp-prod-gcs.name
name = "dp-prod"
sa = module.branch-dp-prod-sa.email
})
"03-project-factory-dev" = templatefile("${path.module}/../../assets/templates/providers.tpl", {
bucket = module.branch-teams-dev-projectfactory-gcs.name
name = "team-dev"
@@ -48,12 +62,14 @@ locals {
}
tfvars = {
"02-networking" = jsonencode({
data_platform_sa = local._data_platform_sas
folder_ids = {
networking = module.branch-network-folder.id
networking-dev = module.branch-network-dev-folder.id
networking-prod = module.branch-network-prod-folder.id
}
project_factory_sa = local._project_factory_sas
})
"02-security" = jsonencode({
folder_id = module.branch-security-folder.id
@@ -61,6 +77,14 @@ locals {
for k, v in local._project_factory_sas : k => [v]
}
})
"03-data-platform-dev" = jsonencode({
folder_id = module.branch-dp-dev-folder.id
date_platform_sa = module.branch-dp-dev-sa.iam_email
})
"03-data-platform-prod" = jsonencode({
folder_id = module.branch-dp-prod-folder.id
date_platform_sa = module.branch-dp-prod-sa.iam_email
})
}
}

View File

@@ -0,0 +1,33 @@
# skip boilerplate check
allow-dataflow-load-ingress-traffic:
  description: "Allow traffic on Cloud Dataflow subnet"
  direction: INGRESS
  action: allow
  sources: []
  ranges:
    - 10.10.0.0/24
    - 10.10.1.0/24
  targets: []
  use_service_accounts: false
  rules:
    - protocol: tcp
      ports:
        - 12345
        - 12346
allow-composer-health-checks:
  description: "Allow Health Checks"
  direction: INGRESS
  action: allow
  sources: []
  ranges:
    - 130.211.0.0/22
    - 35.191.0.0/16
  targets: []
  use_service_accounts: false
  rules:
    - protocol: tcp
      ports:
        - 80
        - 443

View File

@@ -0,0 +1,5 @@
# skip boilerplate check
region: europe-west1
description: Default subnet for dev Data Platform - Load layer Dataflow
ip_cidr_range: 10.10.0.0/24

View File

@@ -0,0 +1,8 @@
# skip boilerplate check
region: europe-west1
description: Default subnet for dev Data Platform - Orchestration layer Composer
ip_cidr_range: 172.18.16.0/24
secondary_ip_range:
  pods: 172.18.24.0/22
  services: 172.18.28.0/24

View File

@@ -0,0 +1,5 @@
# skip boilerplate check
region: europe-west1
description: Default subnet for dev Data Platform - Transformation layer Dataflow
ip_cidr_range: 10.10.1.0/24

View File

@@ -0,0 +1,5 @@
# skip boilerplate check
region: europe-west1
description: Default subnet for dev Data Platform - Load layer Dataflow
ip_cidr_range: 10.20.0.0/24

View File

@@ -0,0 +1,8 @@
# skip boilerplate check
region: europe-west1
description: Default subnet for dev Data Platform - Orchestration layer Composer
ip_cidr_range: 10.20.2.0/24
secondary_ip_range:
  pods: 10.20.8.0/22
  services: 10.20.12.0/24

View File

@@ -0,0 +1,5 @@
# skip boilerplate check
region: europe-west1
description: Default subnet for dev Data Platform - Transformation layer Dataflow
ip_cidr_range: 10.20.1.0/24

View File

@@ -89,5 +89,5 @@ module "landing-nat-ew1" {
router_create = true
router_name = "prod-nat-ew1"
router_network = module.landing-vpc.name
router_asn = 4200001024
router_asn = 65530
}

View File

@@ -27,6 +27,30 @@ locals {
shared_vpc_self_link = module.prod-spoke-vpc.self_link
vpc_host_project = module.prod-spoke-project.project_id
})
"03-data-platform-prod" = jsonencode({
network_self_link = module.prod-spoke-vpc.self_link
subnet_self_links = {
load = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-lod-ew1"].self_link
orchestration = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-orc-ew1"].self_link
transformation = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-trf-ew1"].self_link
}
})
"03-data-platform-prod" = jsonencode({
network_config = {
host_project = module.prod-spoke-project.project_id
network = module.prod-spoke-vpc.self_link
vpc_subnet_range = {
load = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-lod-ew1"].ip_cidr_range
orchestration = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-orc-ew1"].ip_cidr_range
transformation = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-trf-ew1"].ip_cidr_range
}
vpc_subnet_self_link = {
load = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-lod-ew1"].self_link
orchestration = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-orc-ew1"].self_link
transformation = module.prod-spoke-vpc.subnets["europe-west1/prod-dp-trf-ew1"].self_link
}
}
})
}
}

View File

@@ -27,6 +27,7 @@ module "dev-spoke-project" {
disable_dependent_services = false
}
services = [
"container.googleapis.com",
"compute.googleapis.com",
"dns.googleapis.com",
"iap.googleapis.com",
@@ -92,7 +93,7 @@ module "dev-spoke-cloudnat" {
name = "dev-nat-${local.region_trigram[each.value]}"
router_create = true
router_network = module.dev-spoke-vpc.name
router_asn = 4200001024
router_asn = 65530
logging_filter = "ERRORS_ONLY"
}
@@ -112,6 +113,7 @@ resource "google_project_iam_binding" "dev_spoke_project_iam_delegated" {
project = module.dev-spoke-project.project_id
role = "roles/resourcemanager.projectIamAdmin"
members = [
var.data_platform_sa.dev,
var.project_factory_sa.dev
]
condition {

View File

@@ -92,7 +92,7 @@ module "prod-spoke-cloudnat" {
name = "prod-nat-${local.region_trigram[each.value]}"
router_create = true
router_network = module.prod-spoke-vpc.name
router_asn = 4200001024
router_asn = 65530
logging_filter = "ERRORS_ONLY"
}
@@ -112,6 +112,7 @@ resource "google_project_iam_binding" "prod_spoke_project_iam_delegated" {
project = module.prod-spoke-project.project_id
role = "roles/resourcemanager.projectIamAdmin"
members = [
var.data_platform_sa.prod,
var.project_factory_sa.prod
]
condition {

View File

@@ -50,6 +50,13 @@ variable "data_dir" {
default = "data"
}
variable "data_platform_sa" {
# tfdoc:variable:source 01-resman
description = "IAM emails for Data Platform service accounts."
type = map(string)
default = {}
}
variable "dns" {
description = "Onprem DNS resolvers."
type = map(list(string))

View File

@@ -0,0 +1,6 @@
# Data Platform
The Data Platform (DP) builds on top of your foundations to create and set up projects (and related resources) to be used for your workloads.
It is organized in folders representing environments (e.g. "dev", "prod"), each implemented as a stand-alone Terraform configuration.
This directory contains a single DP ([`dev/`](./dev/)) as an example - to implement multiple environments (e.g. "prod" and "dev") you'll need to copy the `dev` folder into one folder per environment, then customize variables following the instructions found in [`dev/README.md`](./dev/README.md).

View File

@@ -0,0 +1,140 @@
# Data Platform
The Data Platform (DP) builds on top of your foundations to create and set up projects (and related resources) to be used for your data platform.
<p align="center">
<img src="diagram.png" alt="Data Platform diagram">
</p>
## Design overview and choices
The DP creates projects in a well-defined context, according to your resource management structure. Within the DP folder, resources are organized by environment.
Within each environment, separate projects are created for the different data layers, so that Service Account and Group roles can be kept apart. Roles are assigned at the project level.
The Data Platform takes care of the following activities:
- Project creation
- API/Services enablement
- Service accounts creation
- IAM roles assignment for groups and service accounts
- KMS keys roles assignment
- Shared VPC attachment and subnets IAM binding
- Project-level org policies definition
- Billing setup (billing account attachment and budget configuration)
- Resources in each project needed to run your data platform.
You can find more details on the Data Platform implementation in its [README](../../../examples/data-solutions/data-platform-foundations/).
### User Groups
The DP relies on user groups to assign roles. Groups provide a stable frame of reference that decouples the final set of permissions for each group from the stage where entities and resources are created and their IAM bindings defined. More detail on the user groups used by the DP is available [here](../../../examples/data-solutions/data-platform-foundations/#groups).
### Network
The DP relies on the Shared VPC defined in the [`02-networking`](../../../02-network-vpn) stage.
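
As a rough illustration of what this stage expects from networking, a `network_config` value matching the type documented in the [Variables](#variables) table could look like the sketch below. When running on top of FAST this value is normally written by the networking stage to a `.tfvars` file; the project, network and subnet names used here are placeholders and not values produced by any stage.

```hcl
# terraform.tfvars (illustrative sketch; every name below is a placeholder)
network_config = {
  # Shared VPC host project id (hypothetical)
  host_project = "myprefix-dev-net-spoke-0"
  # Self link of the Shared VPC network (hypothetical)
  network = "https://www.googleapis.com/compute/v1/projects/myprefix-dev-net-spoke-0/global/networks/dev-spoke-0"
  # One subnet per Data Platform layer that needs network access
  vpc_subnet_self_link = {
    load           = "https://www.googleapis.com/compute/v1/projects/myprefix-dev-net-spoke-0/regions/europe-west1/subnetworks/dev-dp-lod-ew1"
    orchestration  = "https://www.googleapis.com/compute/v1/projects/myprefix-dev-net-spoke-0/regions/europe-west1/subnetworks/dev-dp-orc-ew1"
    transformation = "https://www.googleapis.com/compute/v1/projects/myprefix-dev-net-spoke-0/regions/europe-west1/subnetworks/dev-dp-trf-ew1"
  }
}
```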
### Encryption
The DP may rely on Cloud KMS crypto keys created by the [`02-security`](../../../02-security) stage.
## How to run this stage
This stage is meant to be executed after "foundational stages" (i.e., stages [`00-bootstrap`](../../00-bootstrap), [`01-resman`](../../01-resman), [`02-networking`](../../02-networking) and [`02-security`](../../02-security)) have been run.
It's of course possible to run this stage in isolation, provided the architectural prerequisites are satisfied (e.g., networking) and the Service Account running the stage is granted the roles/permissions below:
- One service account per environment, each with appropriate permissions
  - at the organization level, a custom role for networking operations including the following permissions
    - `"compute.organizations.enableXpnResource"`,
    - `"compute.organizations.disableXpnResource"`,
    - `"compute.subnetworks.setIamPolicy"`,
    - and role `"roles/orgpolicy.policyAdmin"`
  - on each folder where projects are created
    - `"roles/logging.admin"`
    - `"roles/owner"`
    - `"roles/resourcemanager.folderAdmin"`
    - `"roles/resourcemanager.projectCreator"`
  - on the host project for the Shared VPC
    - `"roles/browser"`
    - `"roles/compute.viewer"`
- VPC Host projects and their subnets should exist when creating projects
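
If you run this stage in isolation, the folder-level roles and the custom networking role listed above have to be granted by hand. The following is a minimal sketch of one way to do that with the Google provider, not the way FAST itself does it (the `01-resman` stage handles this through its own modules). The variable names, the resource names and the custom role id are hypothetical.

```hcl
# Hypothetical inputs: replace with your own folder, organization and service account.
variable "folder_id" {
  description = "Folder where Data Platform projects are created, in folders/nnnn format."
  type        = string
}

variable "organization_id" {
  description = "Numeric organization id."
  type        = string
}

variable "stage_sa_email" {
  description = "Email of the service account that runs this stage."
  type        = string
}

locals {
  # Folder-level roles listed in this README.
  folder_roles = toset([
    "roles/logging.admin",
    "roles/owner",
    "roles/resourcemanager.folderAdmin",
    "roles/resourcemanager.projectCreator",
  ])
}

# Additive (non-authoritative) bindings, one per role.
resource "google_folder_iam_member" "stage_sa" {
  for_each = local.folder_roles
  folder   = var.folder_id
  role     = each.value
  member   = "serviceAccount:${var.stage_sa_email}"
}

# Custom role carrying the networking permissions listed in this README.
resource "google_organization_iam_custom_role" "dp_network_admin" {
  org_id  = var.organization_id
  role_id = "dataPlatformNetworkAdmin" # hypothetical id
  title   = "Data Platform network admin (example)"
  permissions = [
    "compute.organizations.enableXpnResource",
    "compute.organizations.disableXpnResource",
    "compute.subnetworks.setIamPolicy",
  ]
}
```

The sketch uses `google_folder_iam_member` rather than an authoritative binding so that it does not remove members already present on the folder. The custom role still has to be bound to the service account at the organization level, together with `roles/orgpolicy.policyAdmin`.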
### Providers configuration
If you're running this on top of Fast, you should run the following commands to create the providers file, and populate the required variables from the previous stage.
```bash
# Variable `outputs_location` is set to `../../../config` in stage 01-resman
cd fabric-fast/stages/03-data-platform/dev
ln -s ../../../config/03-data-platform-dev/providers.tf
```
### Variable configuration
There are two broad sets of variables you will need to fill in:
- variables shared by other stages (org id, billing account id, etc.), or derived from a resource managed by a different stage (folder id, automation project id, etc.)
- variables specific to resources managed by this stage
To avoid the tedious job of filling in the first group of variables with values derived from other stages' outputs, the same mechanism used above for the provider configuration can be used to leverage pre-configured `.tfvars` files.
If you configured a valid path for `outputs_location` in the bootstrap and networking stages, simply link the relevant `terraform-*.auto.tfvars.json` files from this stage's outputs folder (under the path you specified), where the `*` above is set to the name of the stage that produced it. For this stage, the following `.tfvars` files are available:
```bash
# Variable `outputs_location` is set to `../../../config` in stages 00-bootstrap and 02-networking
ln -s ../../../config/03-data-platform-dev/terraform-bootstrap.auto.tfvars.json
ln -s ../../../config/03-data-platform-dev/terraform-networking.auto.tfvars.json
```
If you're not using Fast, refer to the [Variables](#variables) table at the bottom of this document for a full list of variables, their origin (e.g., a stage or specific to this one), and descriptions explaining their meaning.
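
For the second group of variables (those specific to this stage), a hand-written `terraform.tfvars` is usually enough. The sketch below only restates the defaults documented in the [Variables](#variables) table, so it is a starting point to edit rather than a recommended configuration; the file name is an assumption.

```hcl
# terraform.tfvars (illustrative; values are the defaults from the Variables table)

# Keep data services protected from accidental deletion.
data_force_destroy = false

# Cloud NAT is off by default.
enable_cloud_nat = false

# Group names used for role assignment.
groups = {
  data-analysts  = "gcp-data-analysts"
  data-engineers = "gcp-data-engineers"
  data-security  = "gcp-data-security"
}

# Region and multi-region used for deployed resources.
location_config = {
  region       = "europe-west1"
  multi_region = "eu"
}
```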
Once the configuration is complete, apply this stage by running:
```bash
terraform init
terraform apply
```
<!-- TFDOC OPTS files:1 show_extra:1 -->
<!-- BEGIN TFDOC -->
## Files
| name | description | modules | resources |
|---|---|---|---|
| [main.tf](./main.tf) | Data Platform. | <code>data-platform-foundations</code> | |
| [outputs.tf](./outputs.tf) | Output variables. | | <code>local_file</code> |
| [providers.tf](./providers.tf) | Provider configurations. | | |
| [variables.tf](./variables.tf) | Terraform Variables. | | |
## Variables
| name | description | type | required | default | producer |
|---|---|:---:|:---:|:---:|:---:|
| [billing_account_id](variables.tf#L17) | Billing account id. | <code>string</code> | ✓ | | <code>00-bootstrap</code> |
| [folder_id](variables.tf#L66) | Folder to be used for the networking resources in folders/nnnn format. | <code>string</code> | ✓ | | <code>resman</code> |
| [network_config](variables.tf#L94) | Network configurations to use. Specify a shared VPC to use, if null networks will be created in projects. | <code title="object&#40;&#123;&#10; host_project &#61; string&#10; network &#61; string&#10; vpc_subnet_self_link &#61; object&#40;&#123;&#10; load &#61; string&#10; transformation &#61; string&#10; orchestration &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | ✓ | | |
| [organization](variables.tf#L107) | Organization details. | <code title="object&#40;&#123;&#10; domain &#61; string&#10; id &#61; number&#10; customer_id &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | ✓ | | <code>00-bootstrap</code> |
| [prefix](variables.tf#L123) | Unique prefix used for resource names. Not used for projects if 'project_create' is null. | <code>string</code> | ✓ | | <code>00-bootstrap</code> |
| [composer_config](variables.tf#L23) | | <code title="object&#40;&#123;&#10; node_count &#61; number&#10; ip_range_cloudsql &#61; string&#10; ip_range_gke_master &#61; string&#10; ip_range_web_server &#61; string&#10; project_policy_boolean &#61; map&#40;bool&#41;&#10; region &#61; string&#10; ip_allocation_policy &#61; object&#40;&#123;&#10; use_ip_aliases &#61; string&#10; cluster_secondary_range_name &#61; string&#10; services_secondary_range_name &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; node_count &#61; 3&#10; ip_range_cloudsql &#61; &#34;172.18.29.0&#47;24&#34;&#10; ip_range_gke_master &#61; &#34;172.18.30.0&#47;28&#34;&#10; ip_range_web_server &#61; &#34;172.18.30.16&#47;28&#34;&#10; project_policy_boolean &#61; &#123;&#10; &#34;constraints&#47;compute.requireOsLogin&#34; &#61; true&#10; &#125;&#10; region &#61; &#34;europe-west1&#34;&#10; ip_allocation_policy &#61; &#123;&#10; use_ip_aliases &#61; &#34;true&#34;&#10; cluster_secondary_range_name &#61; &#34;pods&#34;&#10; services_secondary_range_name &#61; &#34;services&#34;&#10; &#125;&#10;&#125;">&#123;&#8230;&#125;</code> | |
| [data_force_destroy](variables.tf#L54) | Flag to set 'force_destroy' on data services like BigQuery or Cloud Storage. | <code>bool</code> | | <code>false</code> | |
| [enable_cloud_nat](variables.tf#L60) | Network Cloud NAT flag. | <code>bool</code> | | <code>false</code> | |
| [groups](variables.tf#L72) | Groups. | <code>map&#40;string&#41;</code> | | <code title="&#123;&#10; data-analysts &#61; &#34;gcp-data-analysts&#34;&#10; data-engineers &#61; &#34;gcp-data-engineers&#34;&#10; data-security &#61; &#34;gcp-data-security&#34;&#10;&#125;">&#123;&#8230;&#125;</code> | |
| [location_config](variables.tf#L82) | Locations where resources will be deployed. Map to configure region and multiregion specs. | <code title="object&#40;&#123;&#10; region &#61; string&#10; multi_region &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; region &#61; &#34;europe-west1&#34;&#10; multi_region &#61; &#34;eu&#34;&#10;&#125;">&#123;&#8230;&#125;</code> | |
| [outputs_location](variables.tf#L117) | Path where providers, tfvars files, and lists for the following stages are written. Leave empty to disable. | <code>string</code> | | <code>null</code> | |
| [project_id](variables.tf#L129) | Project id, references existing project if `project_create` is null. | <code title="object&#40;&#123;&#10; landing &#61; string&#10; load &#61; string&#10; orchestration &#61; string&#10; trasformation &#61; string&#10; datalake-l0 &#61; string&#10; datalake-l1 &#61; string&#10; datalake-l2 &#61; string&#10; datalake-playground &#61; string&#10; common &#61; string&#10; exposure &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; landing &#61; &#34;lnd&#34;&#10; load &#61; &#34;lod&#34;&#10; orchestration &#61; &#34;orc&#34;&#10; trasformation &#61; &#34;trf&#34;&#10; datalake-l0 &#61; &#34;dtl-0&#34;&#10; datalake-l1 &#61; &#34;dtl-1&#34;&#10; datalake-l2 &#61; &#34;dtl-2&#34;&#10; datalake-playground &#61; &#34;dtl-plg&#34;&#10; common &#61; &#34;cmn&#34;&#10; exposure &#61; &#34;exp&#34;&#10;&#125;">&#123;&#8230;&#125;</code> | |
| [project_services](variables.tf#L157) | List of core services enabled on all projects. | <code>list&#40;string&#41;</code> | | <code title="&#91;&#10; &#34;cloudresourcemanager.googleapis.com&#34;,&#10; &#34;iam.googleapis.com&#34;,&#10; &#34;serviceusage.googleapis.com&#34;,&#10; &#34;stackdriver.googleapis.com&#34;&#10;&#93;">&#91;&#8230;&#93;</code> | |
## Outputs
| name | description | sensitive | consumers |
|---|---|:---:|---|
| [bigquery_datasets](outputs.tf#L35) | BigQuery datasets. | | |
| [demo_commands](outputs.tf#L65) | Demo commands. | | |
| [gcs_buckets](outputs.tf#L40) | GCS buckets. | | |
| [kms_keys](outputs.tf#L45) | Cloud KMS keys. | | |
| [projects](outputs.tf#L50) | GCP projects information. | | |
| [vpc_network](outputs.tf#L55) | VPC network. | | |
| [vpc_subnet](outputs.tf#L60) | VPC subnetworks. | | |
<!-- END TFDOC -->

Binary file not shown.


View File

@@ -0,0 +1,39 @@
/**
* Copyright 2022 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
# tfdoc:file:description Data Platform.
locals {
_network_config = merge(
var.network_config_composer,
var.network_config
)
}
module "data-platform" {
source = "../../../../examples/data-solutions/data-platform-foundations"
billing_account_id = var.billing_account_id
composer_config = var.composer_config
data_force_destroy = var.data_force_destroy
folder_id = var.folder_id
groups = var.groups
network_config = local._network_config
organization_domain = var.organization_domain
prefix = var.prefix
project_services = var.project_services
region = var.region
service_encryption_keys = var.service_encryption_keys
}

View File

@@ -0,0 +1,61 @@
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# tfdoc:file:description Output variables.

locals {
  tfvars = {}
}

resource "local_file" "tfvars" {
  for_each = var.outputs_location == null ? {} : local.tfvars
  filename = "${var.outputs_location}/${each.key}/terraform-dataplatform-dev.auto.tfvars.json"
  content  = each.value
}

# outputs

output "bigquery_datasets" {
  description = "BigQuery datasets."
  value       = module.data-platform.bigquery-datasets
}

output "gcs_buckets" {
  description = "GCS buckets."
  value       = module.data-platform.gcs-buckets
}

output "kms_keys" {
  description = "Cloud KMS keys."
  value       = module.data-platform.kms_keys
}

output "projects" {
  description = "GCP Projects information."
  value       = module.data-platform.projects
}

output "vpc_network" {
  description = "VPC network."
  value       = module.data-platform.vpc_network
}

output "vpc_subnet" {
  description = "VPC subnetworks."
  value       = module.data-platform.vpc_subnet
}

output "demo_commands" {
  description = "Demo commands."
  value       = module.data-platform.demo_commands
}

@@ -0,0 +1,141 @@
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# tfdoc:file:description Terraform Variables.

variable "billing_account_id" {
  # tfdoc:variable:source 00-bootstrap
  description = "Billing account id."
  type        = string
}

variable "composer_config" {
  type = object({
    node_count      = number
    airflow_version = string
    env_variables   = map(string)
  })
  default = {
    node_count      = 3
    airflow_version = "composer-1.17.5-airflow-2.1.4"
    env_variables   = {}
  }
}

variable "data_force_destroy" {
  description = "Flag to set 'force_destroy' on data services like BigQuery or Cloud Storage."
  type        = bool
  default     = false
}

variable "folder_id" {
  # tfdoc:variable:source resman
  description = "Folder to be used for the networking resources in folders/nnnn format."
  type        = string
}

variable "groups" {
  description = "Groups."
  type        = map(string)
  default = {
    data-analysts  = "gcp-data-analysts"
    data-engineers = "gcp-data-engineers"
    data-security  = "gcp-data-security"
  }
}

variable "network_config" {
  description = "Network configuration to use. Specify a Shared VPC to use; if null, networks will be created in projects."
  type = object({
    host_project      = string
    network_self_link = string
    subnet_self_links = object({
      load           = string
      transformation = string
      orchestration  = string
    })
  })
}

variable "network_config_composer" {
  description = "Network configurations to use for Composer."
  type = object({
    composer_ip_ranges = object({
      cloudsql   = string
      gke_master = string
      web_server = string
    })
    composer_secondary_ranges = object({
      pods     = string
      services = string
    })
  })
  default = {
    composer_ip_ranges = {
      cloudsql   = "172.18.29.0/24"
      gke_master = "172.18.30.0/28"
      web_server = "172.18.30.16/28"
    }
    composer_secondary_ranges = {
      pods     = "pods"
      services = "services"
    }
  }
}

variable "organization_domain" {
  description = "Organization domain."
  type        = string
}

variable "outputs_location" {
  description = "Path where providers, tfvars files, and lists for the following stages are written. Leave empty to disable."
  type        = string
  default     = null
}

variable "prefix" {
  # tfdoc:variable:source 00-bootstrap
  description = "Unique prefix used for resource names. Not used for projects if 'project_create' is null."
  type        = string
}

variable "project_services" {
  description = "List of core services enabled on all projects."
  type        = list(string)
  default = [
    "cloudresourcemanager.googleapis.com",
    "iam.googleapis.com",
    "serviceusage.googleapis.com",
    "stackdriver.googleapis.com"
  ]
}

variable "region" {
  description = "Region used for regional resources."
  type        = string
  default     = "europe-west1"
}

variable "service_encryption_keys" { # service encryption key
  description = "Cloud KMS keys to use to encrypt different services. Key location should match service region."
  type = object({
    bq       = string
    composer = string
    dataflow = string
    storage  = string
    pubsub   = string
  })
  default = null
}
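
Of the variables declared above, five have no default and must be supplied: `billing_account_id`, `folder_id`, `network_config`, `organization_domain`, and `prefix`. A minimal `terraform.tfvars` sketch could look like the following. Every value is a hypothetical placeholder (IDs, domain, prefix, and self links are invented for illustration), and the exact self link format expected by the underlying module is an assumption.

```hcl
# terraform.tfvars: hypothetical values throughout, replace with your own.

billing_account_id  = "012345-67890A-BCDEF0"
folder_id           = "folders/123456789012"
organization_domain = "example.com"
prefix              = "myco"

# Shared VPC to attach to; the self link format shown here is an assumption.
network_config = {
  host_project      = "example-host-project"
  network_self_link = "projects/example-host-project/global/networks/example-vpc"
  subnet_self_links = {
    load           = "projects/example-host-project/regions/europe-west1/subnetworks/example-load"
    transformation = "projects/example-host-project/regions/europe-west1/subnetworks/example-transformation"
    orchestration  = "projects/example-host-project/regions/europe-west1/subnetworks/example-orchestration"
  }
}
```

The remaining variables (`composer_config`, `groups`, `network_config_composer`, `project_services`, `region`, and so on) fall back to the defaults declared above unless they are overridden in the same file.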

@@ -8,21 +8,21 @@ Refer to each stage's documentation for a detailed description of its purpose, t
## Organizational level (00-01)
- [Bootstrap](00-bootstrap/README.md)
Enables critical organization-level functionality that depends on broad permissions. It has two primary purposes. The first is to bootstrap the resources needed for automation of this and the following stages (service accounts, GCS buckets). The second is to apply the minimum amount of configuration needed at the organization level, to avoid the need for broad permissions later on, and to implement a minimum of security features like sinks and exports from the start.
- [Resource Management](01-resman/README.md)
Creates the base resource hierarchy (folders) and the automation resources required later to delegate deployment of each part of the hierarchy to separate stages. This stage also configures organization-level policies and any exceptions needed by different branches of the resource hierarchy.
## Shared resources (02)
- [Security](02-security/README.md)
Manages centralized security configurations in a separate stage, and is typically owned by the security team. This stage implements VPC Service Controls via separate perimeters for environments and central services, and creates projects to host centralized KMS keys used by the whole organization. It's meant to be easily extended to include other security-related resources which are required, like Secret Manager.
- Networking ([VPN](02-networking-vpn/README.md)/[NVA](02-networking-nva/README.md))
Manages centralized network resources in a separate stage, and is typically owned by the networking team. This stage implements a hub-and-spoke design, and includes connectivity via VPN to on-premises, and YAML-based factories for firewall rules (hierarchical and VPC-level) and subnets. It's currently available in two versions: [spokes connected via VPN](02-networking-vpn/README.md), [and spokes connected via appliances](02-networking-nva/README.md).
- [Networking](02-networking/README.md)
Manages centralized network resources in a separate stage, and is typically owned by the networking team. This stage implements a hub-and-spoke design, and includes connectivity via VPN to on-premises, and YAML-based factories for firewall rules (hierarchical and VPC-level) and subnets.
## Environment-level resources (03)
- [Project Factory](03-project-factory/README.md)
YAML-based factory to create and configure application or team-level projects. Configuration includes VPC-level settings for Shared VPC, service-level configuration for CMEK encryption via centralized keys, and service account creation for workloads and applications. This stage is meant to be used once per environment.
- Data Platform (in development)
- GKE Multitenant (in development)