
Fabric FAST

Setting up a production-ready GCP organization is often a time-consuming process. Fabric FAST aims to speed up this process via two complementary goals. First, FAST provides a design of a GCP organization that includes the typical elements required by enterprise customers. Second, we provide a reference implementation of the FAST design using Terraform.

Note that while our implementation is necessarily influenced (and constrained) by the way Terraform works, the design we put forward only refers to GCP constructs and features. In other words, while we use Terraform for our reference implementation, in theory, the FAST design can be implemented using any other tool (e.g., Pulumi, bash scripts, or even calling the relevant APIs directly).

Fabric FAST comes from engineers in Google Cloud's Professional Services Organization, with a combined experience of decades solving the typical technical problems faced by GCP customers. While every GCP user has specific requirements, many common issues arise repeatedly. Solving those issues correctly from the beginning is key to a robust and scalable GCP setup. It's those common issues and their solutions that Fabric FAST aims to collect and present coherently.

Fabric FAST was initially conceived to help enterprises quickly set up a GCP organization following battle-tested and widely-used patterns. Despite its origin in enterprise environments, FAST includes many customization points making it an ideal blueprint for organizations of all sizes, ranging from startups to the largest companies.

Guiding principles

Contracts and stages

FAST uses the concept of stages, which individually perform precise tasks but, taken together, build a functional, ready-to-use GCP organization. More importantly, stages are modeled around the security boundaries that typically appear in mature organizations. This arrangement allows delegating ownership of each stage to the team responsible for the types of resources it manages. For example, as its name suggests, the networking stage sets up all the networking elements and is usually the responsibility of a dedicated networking team within the organization.

From the perspective of FAST's overall design, stages also work as contracts or interfaces: each stage defines the prerequisites and inputs it needs to perform its task, and generates the outputs needed by other stages lower in the chain. The diagram below shows the relationships between stages.

Stages diagram
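
The contract idea can be sketched in a few lines of Python. This is only an illustration of the concept, not part of FAST: the stage names other than networking, and all the input and output keys, are hypothetical. Each stage declares what it needs and what it provides, and a small check reports any stage whose inputs are not produced by an earlier stage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stage:
    """A stage viewed as a contract: what it needs and what it provides."""
    name: str
    inputs: frozenset = field(default_factory=frozenset)
    outputs: frozenset = field(default_factory=frozenset)


def check_chain(stages):
    """Return the unmet inputs of each stage when run in the given order."""
    available = set()
    missing = {}
    for stage in stages:
        unmet = stage.inputs - available
        if unmet:
            missing[stage.name] = sorted(unmet)
        # Outputs become available to every stage lower in the chain.
        available |= stage.outputs
    return missing


# Hypothetical stage names and interface keys, for illustration only.
bootstrap = Stage(
    "bootstrap",
    outputs=frozenset({"automation_project", "billing_account"}),
)
resman = Stage(
    "resource-management",
    inputs=frozenset({"automation_project"}),
    outputs=frozenset({"folder_ids"}),
)
networking = Stage(
    "networking",
    inputs=frozenset({"folder_ids", "billing_account"}),
    outputs=frozenset({"vpc_self_links"}),
)

ordered = check_chain([bootstrap, resman, networking])
shuffled = check_chain([networking, bootstrap, resman])
print(ordered)
print(shuffled)
```

With the stages in dependency order the check reports nothing missing; running networking first reports the inputs it would still be waiting for.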

Security-first design

Security was, from the beginning, one of the most critical elements in the design of Fabric FAST. Many of FAST's design decisions aim to build the foundations of a secure organization. In fact, the first two stages deal mainly with the organization-wide security setup.

In keeping with this security-first approach, FAST also aims to minimize the permissions granted to principals. We achieve this through the meticulous use of groups, service accounts, custom roles, and Cloud IAM Conditions, among other things.
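
As a rough illustration of how a group and an IAM Condition combine to narrow a grant, the sketch below builds the JSON shape of a conditional role binding. The group address, bucket name, and helper function names are made up for the example, and this is not how FAST itself assigns roles (FAST does so through Terraform).

```python
import json


def conditional_binding(role, members, title, expression):
    """Build one IAM role binding restricted by a condition expression."""
    return {
        "role": role,
        "members": sorted(members),
        "condition": {"title": title, "expression": expression},
    }


def policy_with(bindings):
    """Wrap bindings in a policy document."""
    # Policies that contain conditional bindings must use policy version 3.
    return {"version": 3, "bindings": list(bindings)}


# Hypothetical group and bucket: the grant goes to a group rather than to
# individual users, and only applies to objects in one bucket.
binding = conditional_binding(
    role="roles/storage.objectViewer",
    members=["group:gcp-data-engineers@example.com"],
    title="landing-bucket-only",
    expression=(
        'resource.name.startsWith('
        '"projects/_/buckets/example-landing-bucket")'
    ),
)
policy = policy_with([binding])
print(json.dumps(policy, indent=2))
```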

Extensive use of factories

A resource factory consumes a simple representation of a resource (e.g., in YAML) and deploys it (e.g., using Terraform). Used correctly, factories can help decrease the management overhead of large-scale infrastructure deployments. See "Resource Factories: A descriptive approach to Terraform" for more details and the rationale behind factories.

FAST uses YAML-based factories to deploy subnets and firewall rules and, as its name suggests, in the project factory stage.
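
To make the factory idea concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not FAST's implementation: the function name, rule names, and network name are hypothetical, and the input dict stands in for what a YAML parser would return. The factory expands compact rule declarations into resource blocks in Terraform's JSON syntax.

```python
import json


def firewall_rule_factory(rules, network):
    """Expand compact rule declarations into Terraform JSON resources."""
    resources = {}
    for name, spec in rules.items():
        resources[name] = {
            "name": name,
            "network": network,
            # Defaults keep the declarations short; override per rule.
            "direction": spec.get("direction", "INGRESS"),
            "source_ranges": spec.get("source_ranges", []),
            "allow": [
                {"protocol": proto, "ports": [str(p) for p in ports]}
                for proto, ports in spec.get("allow", {}).items()
            ],
        }
    return {"resource": {"google_compute_firewall": resources}}


# Stand-in for parsed YAML such as:
#   allow-ssh-from-iap:
#     source_ranges: ["35.235.240.0/20"]
#     allow: {tcp: [22]}
declarations = {
    "allow-ssh-from-iap": {
        "source_ranges": ["35.235.240.0/20"],
        "allow": {"tcp": [22]},
    },
    "allow-internal-web": {
        "source_ranges": ["10.0.0.0/8"],
        "allow": {"tcp": [80, 443]},
    },
}

tf_json = firewall_rule_factory(declarations, network="example-vpc")
print(json.dumps(tf_json, indent=2))
```

The point of the pattern is that adding a rule means adding a few lines of data rather than a new resource block, which is where the reduction in management overhead comes from.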

Stages and high level design

As mentioned before, FAST relies on multiple stages to progressively bring up your GCP organization(s). Please refer to the stages section for further details.

Implementation

There are many decisions and tasks required to convert an empty GCP organization to one that can host production environments safely. Arguably, FAST could expose those decisions as configuration options to allow for different outcomes. However, supporting all the possible combinations is almost impossible and leads to code which is hard to maintain efficiently.

Instead, FAST aims to leverage different reference architectures as “pluggable modules”, and then have a small set of variables covering only the essential options of each stage. While we could expose every option of the underlying resources as stage-level variables, we prefer to provide the basic implementation and encourage users to modify the codebase if additional (or different) behavior is needed.

Since we expect users to customize FAST to their specific needs, we strive to make its code easy to understand and modify. Root-level modules (i.e., stages) should be low in complexity, which, among other things, means:

  • Code should avoid magic and be as explicit as possible.
  • We hide advanced features and complexity behind modules.
  • We prefer as little indirection as possible.
  • We favor flat over nested.

We also recognize that not every FAST user needs all of its features. You don't have to use our project factory or our GKE implementation if you don't want to. Instead, remove those stages or pieces of code and keep what suits you.

Those familiar with Python will note that FAST follows many of the maxims in the Zen of Python.

Roadmap

Besides the features already described, the FAST roadmap includes:

  • Stage to deploy environment-specific multitenant GKE clusters following Google's best practices
  • Stage to deploy a fully featured data platform
  • Reference implementation to use FAST in CI/CD pipelines
  • Static policy enforcement