Add support for bundling net monitoring tool in a Docker image, and deploying via CR Job (#2609)
* dockerfile and reqs update
* deployment via cloud run jobs
* README
* boilerplate
committed by GitHub
parent bbe84a5ca8
commit 74427386b9
@@ -0,0 +1,23 @@
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM python:3-slim-bookworm

COPY src /app/
RUN pip install -r /app/requirements.txt
RUN chmod 755 /app/main.py

WORKDIR /app

ENTRYPOINT ["./main.py"]
@@ -8,13 +8,13 @@ The tool tracks several distinct usage types across a variety of resources: proj
 The screenshot below is an example of a simple dashboard provided with this blueprint, showing utilization for a specific metric (number of instances per VPC) for multiple VPCs and projects:

-<img src="metric.png" width="640px">
+<img src="metric.png" width="640px" alt="diagram">

 One other example is the IP utilization information per subnet, allowing you to monitor the percentage of used IP addresses in your GCP subnets.

 More complex scenarios are possible by leveraging and combining the 50 different timeseries created by this tool, and connecting them to Cloud Operations dashboards and alerts.

-Refer to the [Cloud Function deployment instructions](./deploy-cloud-function/) for a high level overview and an end-to-end deployment example, and to the[discovery tool documentation](./src/) to try it as a standalone program or to package it in alternative ways.
+Refer to the [Cloud Function](./deploy-cloud-function/) or [Cloud Run Job](./deploy-cloudrun-job/) instructions for a high-level overview and end-to-end deployment examples, and to the [discovery tool documentation](./src/) to try it as a standalone program or to package it in alternative ways.

 ## Metrics created
@@ -0,0 +1,59 @@
# Network Quota Monitoring via Cloud Run Job

This simple Terraform setup allows deploying the [discovery tool for the Network Dashboard](../src/) to a Cloud Run Job triggered by Cloud Scheduler.

For service configuration refer to the [Cloud Function deployment](../deploy-cloud-function/) as the underlying monitoring scraper is the same.
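
Cloud Scheduler starts the job with an authenticated `POST` to the Cloud Run Admin API. As a rough sketch of how that endpoint is composed, mirroring the URI template used in this setup's Terraform code: the `job_run_uri` helper and the project number are hypothetical, while the region and job name are the defaults of the `region` and `name` variables.

```python
def job_run_uri(region: str, project_number: str, job_name: str) -> str:
  """Compose the v1 Cloud Run Admin API endpoint that runs a job."""
  return (f'https://{region}-run.googleapis.com/apis/run.googleapis.com/v1/'
          f'namespaces/{project_number}/jobs/{job_name}:run')


if __name__ == '__main__':
  # "europe-west1" and "netmon" are the variable defaults in this setup;
  # the project number is a made-up placeholder.
  print(job_run_uri('europe-west1', '123456789012', 'netmon'))
```

Note that the namespace segment is the project number, not the project id.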

## Creating and uploading the Docker container

To build the container run `docker build` in the parent folder, then tag and push it to the URL printed in outputs.
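
The steps above can be sketched as a small helper that only prints the commands rather than running them. This is an illustration under assumptions: `image_push_commands` and the example tag are hypothetical, the real tag is the value of the `docker_tag` output, and pushing presumes Docker is already authenticated against Artifact Registry. Building with `-t` tags the image in the same step.

```python
import shlex


def image_push_commands(docker_tag: str, context: str = '.') -> list[list[str]]:
  """Return the docker invocations that build, tag and push the image."""
  return [
      # build from the folder holding the Dockerfile, tagging in one step
      ['docker', 'build', '-t', docker_tag, context],
      ['docker', 'push', docker_tag],
  ]


if __name__ == '__main__':
  # placeholder value, replace with the `docker_tag` Terraform output
  tag = 'europe-west1-docker.pkg.dev/my-project/netmon/netmon'
  for cmd in image_push_commands(tag):
    print(shlex.join(cmd))
```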

## Example configuration

This is an example of a working configuration, where the discovery root is set at the org level, but resources used to compute timeseries need to be part of the hierarchy of two specific folders:

```tfvars
discovery_config = {
  discovery_root    = "organizations/1234567890"
  monitored_folders = ["3456789012", "7890123456"]
}
grant_discovery_iam_roles = true
project_create_config = {
  billing_account_id = "12345-ABCDEF-12345"
  parent_id          = "folders/2345678901"
}
project_id = "my-project"

# tftest modules=5 resources=27
```
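
To illustrate how this configuration reaches the tool, the sketch below mirrors in Python the argument list that the Terraform code in this setup assembles for the container. The flags (`-dr`, `-mon`, `-f`, `-p`) come from that code; the `container_args` helper is hypothetical, and the monitoring project is assumed to fall back to the deployment project because `monitoring_project` is unset.

```python
def container_args(discovery_root, monitoring_project, monitored_folders=(),
                   monitored_projects=()):
  """Mirror the container arguments derived from `discovery_config`."""
  args = ['-dr', discovery_root, '-mon', monitoring_project]
  # each monitored folder and project becomes its own flag/value pair
  for folder in monitored_folders:
    args += ['-f', folder]
  for project in monitored_projects:
    args += ['-p', project]
  return args


if __name__ == '__main__':
  print(container_args('organizations/1234567890', 'my-project',
                       monitored_folders=['3456789012', '7890123456']))
```

With the values from the example, the tool would be started with the organization as discovery root and one `-f` flag per monitored folder.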

## Monitoring dashboard

A monitoring dashboard can optionally be deployed in the same project by setting the `dashboard_json_path` variable to the path of a dashboard JSON file. A sample dashboard is included, and can be deployed with this variable configuration:

```tfvars
dashboard_json_path = "../dashboards/quotas-utilization.json"
```
<!-- BEGIN TFDOC -->
## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [discovery_config](variables.tf#L23) | Discovery configuration. Discovery root is the organization or a folder. If monitored folders and projects are empty, every project under the discovery root node will be monitored. | <code title="object({ discovery_root = string monitored_folders = optional(list(string), []) monitored_projects = optional(list(string), []) })">object({…})</code> | ✓ | |
| [project_id](variables.tf#L69) | Project id where the tool will be deployed. | <code>string</code> | ✓ | |
| [dashboard_json_path](variables.tf#L17) | Optional monitoring dashboard to deploy. | <code>string</code> | | <code>null</code> |
| [grant_discovery_iam_roles](variables.tf#L41) | Optionally grant required IAM roles to the monitoring tool service account. | <code>bool</code> | | <code>false</code> |
| [monitoring_project](variables.tf#L48) | Project where generated metrics will be written. Default is to use the same project where the Cloud Run Job is deployed. | <code>string</code> | | <code>null</code> |
| [name](variables.tf#L54) | Name used to create resources. | <code>string</code> | | <code>"netmon"</code> |
| [project_create_config](variables.tf#L60) | Optional configuration if project creation is required. | <code title="object({ billing_account_id = string parent_id = optional(string) })">object({…})</code> | | <code>null</code> |
| [region](variables.tf#L74) | Compute region where Cloud Run will be deployed. | <code>string</code> | | <code>"europe-west1"</code> |
| [schedule_config](variables.tf#L80) | Scheduler configuration. Region is only used if different from the one used for Cloud Run. | <code title="object({ crontab = optional(string, "*/30 * * * *") region = optional(string) })">object({…})</code> | | <code>{}</code> |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| [docker_tag](outputs.tf#L17) | Docker tag for the container image. | |
| [project_id](outputs.tf#L22) | Project id. | |
| [service_account](outputs.tf#L27) | Cloud Run Job service account. | |
<!-- END TFDOC -->
@@ -0,0 +1,157 @@
/**
 * Copyright 2024 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

# TODO: support custom quota file

locals {
  discovery_roles = ["roles/compute.viewer", "roles/cloudasset.viewer"]
}

module "project" {
  source          = "../../../../modules/project"
  name            = var.project_id
  billing_account = try(var.project_create_config.billing_account_id, null)
  parent          = try(var.project_create_config.parent_id, null)
  project_create  = var.project_create_config != null
  services = [
    "artifactregistry.googleapis.com",
    "cloudasset.googleapis.com",
    "cloudscheduler.googleapis.com",
    "compute.googleapis.com",
    "monitoring.googleapis.com",
    "run.googleapis.com"
  ]
}

module "ar" {
  source     = "../../../../modules/artifact-registry"
  project_id = module.project.project_id
  location   = var.region
  name       = var.name
  format     = { docker = { standard = {} } }
}

module "sa" {
  source       = "../../../../modules/iam-service-account"
  project_id   = module.project.project_id
  name         = var.name
  display_name = "Net monitoring service."
  iam_project_roles = {
    (module.project.project_id) = [
      "roles/monitoring.metricWriter"
    ]
  }
}

module "sa-invoker" {
  source       = "../../../../modules/iam-service-account"
  project_id   = module.project.project_id
  name         = "${var.name}-invoker"
  display_name = "Net monitoring service invoker."
}

module "cr-job" {
  source     = "../../../../modules/cloud-run-v2"
  project_id = module.project.project_id
  name       = var.name
  region     = var.region
  create_job = true
  containers = {
    netmon = {
      image = "${module.ar.url}/${var.name}"
      args = concat(
        [
          "-dr",
          var.discovery_config.discovery_root,
          "-mon",
          coalesce(var.monitoring_project, module.project.project_id)
        ],
        flatten([
          for f in var.discovery_config.monitored_folders : [
            "-f", f
          ]
        ]),
        flatten([
          for f in var.discovery_config.monitored_projects : [
            "-p", f
          ]
        ])
      )
    }
  }
  iam = {
    "roles/run.invoker" = [
      module.sa-invoker.iam_email
    ]
  }
  revision = {
    job = {
      max_retries = 0
    }
  }
  service_account     = module.sa.email
  deletion_protection = false
}

resource "google_cloud_scheduler_job" "job" {
  name             = var.name
  description      = "Schedule net monitor job."
  schedule         = var.schedule_config.crontab
  time_zone        = "UTC"
  attempt_deadline = "320s"
  region           = coalesce(var.schedule_config.region, var.region)
  project          = module.project.project_id
  retry_config {
    retry_count = 1
  }
  http_target {
    http_method = "POST"
    uri         = "https://${var.region}-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/${module.project.number}/jobs/${var.name}:run"
    oauth_token {
      service_account_email = module.sa-invoker.email
    }
  }
}

resource "google_organization_iam_member" "discovery" {
  for_each = toset(
    var.grant_discovery_iam_roles &&
    startswith(var.discovery_config.discovery_root, "organizations/")
    ? local.discovery_roles
    : []
  )
  org_id = split("/", var.discovery_config.discovery_root)[1]
  role   = each.key
  member = module.sa.iam_email
}

resource "google_folder_iam_member" "discovery" {
  for_each = toset(
    var.grant_discovery_iam_roles &&
    startswith(var.discovery_config.discovery_root, "folders/")
    ? local.discovery_roles
    : []
  )
  folder = var.discovery_config.discovery_root
  role   = each.key
  member = module.sa.iam_email
}

resource "google_monitoring_dashboard" "dashboard" {
  count          = var.dashboard_json_path == null ? 0 : 1
  project        = var.project_id
  dashboard_json = file(var.dashboard_json_path)
}
@@ -0,0 +1,30 @@
/**
 * Copyright 2024 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

output "docker_tag" {
  description = "Docker tag for the container image."
  value       = "${module.ar.url}/${var.name}"
}

output "project_id" {
  description = "Project id."
  value       = module.project.project_id
}

output "service_account" {
  description = "Cloud Run Job service account."
  value       = module.sa.email
}
@@ -0,0 +1,87 @@
/**
 * Copyright 2022 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

variable "dashboard_json_path" {
  description = "Optional monitoring dashboard to deploy."
  type        = string
  default     = null
}

variable "discovery_config" {
  description = "Discovery configuration. Discovery root is the organization or a folder. If monitored folders and projects are empty, every project under the discovery root node will be monitored."
  type = object({
    discovery_root     = string
    monitored_folders  = optional(list(string), [])
    monitored_projects = optional(list(string), [])
    # custom_quota_file = optional(string)
  })
  nullable = false
  validation {
    condition = (
      var.discovery_config.monitored_folders != null &&
      var.discovery_config.monitored_projects != null
    )
    error_message = "Monitored folders and projects can be empty lists, but they cannot be null."
  }
}

variable "grant_discovery_iam_roles" {
  description = "Optionally grant required IAM roles to the monitoring tool service account."
  type        = bool
  default     = false
  nullable    = false
}

variable "monitoring_project" {
  description = "Project where generated metrics will be written. Default is to use the same project where the Cloud Run Job is deployed."
  type        = string
  default     = null
}

variable "name" {
  description = "Name used to create resources."
  type        = string
  default     = "netmon"
}

variable "project_create_config" {
  description = "Optional configuration if project creation is required."
  type = object({
    billing_account_id = string
    parent_id          = optional(string)
  })
  default = null
}

variable "project_id" {
  description = "Project id where the tool will be deployed."
  type        = string
}

variable "region" {
  description = "Compute region where Cloud Run will be deployed."
  type        = string
  default     = "europe-west1"
}

variable "schedule_config" {
  description = "Scheduler configuration. Region is only used if different from the one used for Cloud Run."
  type = object({
    crontab = optional(string, "*/30 * * * *")
    region  = optional(string)
  })
  default = {}
}
@@ -17,6 +17,7 @@
 import base64
 import binascii
 import collections
+import functools
 import json
 import logging
 import os
@@ -29,13 +30,17 @@ import yaml
 from google.auth.transport.requests import AuthorizedSession

-HTTP = AuthorizedSession(google.auth.default()[0])
 LOGGER = logging.getLogger('net-dash')
 MONITORING_ROOT = 'netmon/'

 Result = collections.namedtuple('Result', 'phase resource data')


+@functools.cache
+def _http():
+  return AuthorizedSession(google.auth.default()[0])
+
+
 def do_discovery(resources):
   '''Calls discovery plugin functions and collect discovered resources.
@@ -198,10 +203,10 @@ def fetch(request):
   LOGGER.debug(f'fetch {"POST" if request.data else "GET"} {request.url}')
   try:
     if not request.data:
-      response = HTTP.get(request.url, headers=request.headers)
+      response = _http().get(request.url, headers=request.headers)
     else:
-      response = HTTP.post(request.url, headers=request.headers,
-                           data=request.data)
+      response = _http().post(request.url, headers=request.headers,
+                              data=request.data)
   except google.auth.exceptions.RefreshError as e:
     raise SystemExit(e.args[0])
   if response.status_code != 200:
@@ -1,4 +1,4 @@
-click==8.1.3
-google-auth==2.14.1
-PyYAML==6.0
-requests==2.32.0
+click>=8.1.3
+google-auth>=2.14.1
+PyYAML>=6.0
+requests>=2.32.0