[Prometheus] Deploying Thanos for Prometheus HA


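Overview

Prometheus stores its data on local storage (here, volumes provisioned by local-path-provisioner), which makes it hard to run in a highly available (HA) setup on its own. Thanos is what fills that gap when building an HA environment.

Here is what I did. Everything is written in Terraform.

  • Deployed kube-prometheus-stack, with the values adjusted so that node-exporter is deployed as well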

https://artifacthub.io/packages/helm/prometheus-community/kube-prometheus-stack

 

  • Deployed MinIO as the long-term object storage for Thanos

https://artifacthub.io/packages/helm/bitnami/minio

 

  • Deployed Thanos, connected to MinIO

https://artifacthub.io/packages/helm/bitnami/thanos

 

 

Deploying the Prometheus stack

Deploying it as shown below creates node-exporter as a DaemonSet.

resource "kubernetes_namespace_v1" "prometheus_stack" {
  metadata {
    name = "kube-prometheus-stack"
  }
}

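locals {
  # Label attached to the ServiceMonitors below; the Prometheus CR defined later
  # selects monitors by this same label.
  monitor_labels = {
    "${kubernetes_namespace_v1.prometheus_stack.metadata[0].name}/prometheus" = "enabled"
  }
}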

resource "helm_release" "prometheus_stack" {
  repository  = "https://prometheus-community.github.io/helm-charts"
  chart       = "kube-prometheus-stack"
  version     = "66.3.1"
  max_history = 5
  name        = "prometheus-stack"
  namespace   = kubernetes_namespace_v1.prometheus_stack.metadata[0].name
  timeout     = 300
  values = [jsonencode({
    fullnameOverride = "prometheus-stack"
    defaultRules = {
      create = false
    }
    windowsMonitoring         = { enabled = false }
    alertmanager              = { enabled = false }
    grafana                   = { enabled = false }
    kubernetesServiceMonitors = { enabled = true }
    kubeApiServer             = { enabled = false }
    kubelet = {
      enabled = true
      serviceMonitor = {
        additionalLabels = local.monitor_labels
      }
    }
    kubeControllerManager = { enabled = false }
    coreDns               = { enabled = false }
    kubeDns               = { enabled = false }
    kubeEtcd              = { enabled = false }
    kubeScheduler         = { enabled = false }
    kubeProxy             = { enabled = false }
    kubeStateMetrics      = { enabled = true }
    kube-state-metrics = {
      prometheus = {
        monitor = {
          additionalLabels = local.monitor_labels
        }
      }
    }
    nodeExporter = { enabled = true }
    prometheus-node-exporter = {
      fullnameOverride = "node-exporter"
      prometheus = {
        monitor = {
          enabled          = true
          additionalLabels = local.monitor_labels
          jobLabel         = "jobLabel"
        }
      }
      podLabels = {
        jobLabel = "node-exporter"
      }
    }
    prometheusOperator = {
      enabled = true
      tls = {
        enabled = false
      }
      serviceMonitor = {
        selfMonitor = false
      }
    }
    prometheus  = { enabled = false } # Prometheus itself is created separately as a CR
    thanosRuler = { enabled = false }
  })]
}
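
The monitor_labels local matters beyond this chart: the Prometheus CR later in this post only selects ServiceMonitors that carry that label. As a rough sketch, a ServiceMonitor for your own workload would need the same label. The application name "my-app", the "default" namespace, and the "metrics" port name are placeholders I made up for illustration, not part of the original setup.

```hcl
# Hypothetical ServiceMonitor for an application called "my-app".
# Name, namespace, selector, and port name are illustrative placeholders.
resource "kubernetes_manifest" "my_app_service_monitor" {
  manifest = {
    apiVersion = "monitoring.coreos.com/v1"
    kind       = "ServiceMonitor"
    metadata = {
      name      = "my-app"
      namespace = "default"
      # Without this label, a serviceMonitorSelector that matches on
      # "<namespace>/prometheus" = "enabled" would not pick this monitor up.
      labels = local.monitor_labels
    }
    spec = {
      selector = {
        matchLabels = { app = "my-app" }
      }
      endpoints = [
        { port = "metrics", interval = "30s" }
      ]
    }
  }
}
```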

Deploying the Prometheus CR

Installing prometheus-stack also installs the Prometheus Operator, after which the custom resources (CRs) listed on the site below become available.

https://prometheus-operator.dev/docs/api-reference/api/

 

 

When creating a Prometheus resource, the RBAC objects described on the page below must be created along with it.

https://prometheus-operator.dev/docs/platform/rbac/#prometheus-rbac

 


In addition, Prometheus is configured to store its data on the host through local-path-provisioner.

resource "kubernetes_manifest" "prometheus_prometheus" {
  manifest = {
    apiVersion = "monitoring.coreos.com/v1"
    kind       = "Prometheus"
    metadata = {
      name      = "prometheus"
      namespace = var.prometheus_stack_namespace
    }
    spec = {
      replicas = 2
      externalLabels = {
        "cluster" = "storage"
      }
      # Enables the Thanos sidecar so that Thanos can connect to this Prometheus
      thanos = {
        # The sidecar opens gRPC port 10901 by default
        blockSize = "2h"
      }
      serviceAccountName = kubernetes_service_account.prometheus.metadata[0].name
      initContainers = [
        {
          name    = "prometheus-permission"
          image   = "busybox"
          command = ["/bin/chmod", "-R", "777", "/prometheus"]
          volumeMounts = [
            {
              name      = "prometheus-prometheus-db"
              mountPath = "/prometheus"
            }
          ]
        }
      ]
      storage = {
        volumeClaimTemplate = {
          spec = {
            accessModes = ["ReadWriteOnce"]
            resources = {
              requests = {
                storage = "10Gi"
              }
            }
            storageClassName = "local-path"
          }
        }
      }
      serviceMonitorNamespaceSelector = {}
      serviceMonitorSelector = {
        matchLabels = {
          "${var.prometheus_stack_namespace}/prometheus" = "enabled"
        }
      }
      ruleNamespaceSelector = {}
      ruleSelector = {
        matchLabels = {
          "${var.prometheus_stack_namespace}/prometheus" = "enabled"
        }
      }
      podMonitorNamespaceSelector = {}
      podMonitorSelector = {
        matchLabels = {
          "${var.prometheus_stack_namespace}/prometheus" = "enabled"
        }
      }
      alerting = {
        alertmanagers = [
          {
            namespace = var.prometheus_stack_namespace
            name      = "alertmanager-operator"
            port      = "web"
          }
        ]
      }
    }
  }
  wait {
    condition {
      type   = "Available"
      status = "True"
    }
  }
}
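
The Prometheus resource above references kubernetes_service_account.prometheus, which comes from the RBAC step mentioned earlier but is not shown. A minimal sketch of that RBAC in Terraform might look like the following. The rule list is my reading of the Prometheus RBAC example in the Prometheus Operator docs and may differ between versions, and the object names ("prometheus") are assumptions.

```hcl
# Sketch only: object names are assumptions, and the rules should be checked
# against the Prometheus Operator RBAC page for the version in use.
resource "kubernetes_service_account" "prometheus" {
  metadata {
    name      = "prometheus"
    namespace = var.prometheus_stack_namespace
  }
}

resource "kubernetes_cluster_role" "prometheus" {
  metadata {
    name = "prometheus"
  }

  # Targets that Prometheus discovers and scrapes
  rule {
    api_groups = [""]
    resources  = ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
    verbs      = ["get", "list", "watch"]
  }
  rule {
    api_groups = [""]
    resources  = ["configmaps"]
    verbs      = ["get"]
  }
  rule {
    api_groups = ["networking.k8s.io"]
    resources  = ["ingresses"]
    verbs      = ["get", "list", "watch"]
  }
  # Non-resource endpoint such as the API server's /metrics
  rule {
    non_resource_urls = ["/metrics"]
    verbs             = ["get"]
  }
}

resource "kubernetes_cluster_role_binding" "prometheus" {
  metadata {
    name = "prometheus"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = kubernetes_cluster_role.prometheus.metadata[0].name
  }
  subject {
    kind      = "ServiceAccount"
    name      = kubernetes_service_account.prometheus.metadata[0].name
    namespace = var.prometheus_stack_namespace
  }
}
```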

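Deploying MinIO as long-term storage for Thanos

Keep in mind that unless MinIO is deployed in standalone mode, the thanos bucket is not created on initial startup. If you deploy in distributed mode, the bucket has to be created with the minio provider.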

resource "helm_release" "minio" {
  repository  = "https://charts.bitnami.com/bitnami" # needs verification; this may produce errors
  chart       = "minio"
  version     = "14.9.0"
  max_history = 5
  name        = "minio"
  namespace   = "thanos"
  timeout     = 300
  values = [jsonencode({
    mode = "standalone"
    auth = {
      rootUser     = "admin"
      rootPassword = "admin12345"
    }
    defaultBuckets = "thanos"
    persistence = {
      enabled      = true
      storageClass = "" # set as appropriate for your environment
      size         = "500Gi"
    }
  })]
}
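
For the distributed-mode case mentioned above, creating the bucket with a MinIO provider could look roughly like this. I am assuming the community aminueza/minio provider here, since the original does not say which provider was meant. The argument names follow my understanding of that provider and should be checked against its documentation, and the localhost endpoint assumes a port-forward to the MinIO service.

```hcl
terraform {
  required_providers {
    minio = {
      # Assumption: the community MinIO provider
      source = "aminueza/minio"
    }
  }
}

# Assumption: the MinIO API is reachable from where Terraform runs,
# for example through a port-forward to localhost:9000.
provider "minio" {
  minio_server   = "localhost:9000"
  minio_user     = "admin"
  minio_password = "admin12345"
  minio_ssl      = false
}

# Creates the bucket that Thanos expects, instead of relying on defaultBuckets.
resource "minio_s3_bucket" "thanos" {
  bucket = "thanos"
  acl    = "private"
}
```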

Deploying Thanos

Thanos does not work properly unless a secret named objstore exists, so be sure to create it.

The Prometheus service to connect to must be listed in the query section.

resource "kubernetes_secret_v1" "objstore" {
  metadata {
    name      = "objstore"
    namespace = kubernetes_namespace_v1.thanos.metadata[0].name
  }

  data = {
    "objstore.yml" = <<-EOT
      type: s3
      config:
        bucket: "thanos"
        endpoint: "minio.${kubernetes_namespace_v1.thanos.metadata[0].name}.svc.cluster.local:9000"
        insecure: true
        access_key: "admin"
        secret_key: "admin12345"
    EOT
  }

  type = "Opaque"
}
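
One thing worth noting: the objstore secret above is consumed by the Thanos components in the thanos namespace. The Thanos sidecar next to Prometheus only uploads blocks to object storage when it is given an object storage configuration of its own, which the Prometheus CR exposes as thanos.objectStorageConfig (a reference to a secret key in the Prometheus namespace). If uploads from the sidecar are wanted, a sketch under that assumption could be as follows. The resource name objstore_prometheus and the local prometheus_thanos_spec are hypothetical.

```hcl
# Assumption: a copy of the objstore secret in the Prometheus namespace,
# because the sidecar runs inside the Prometheus pods.
resource "kubernetes_secret_v1" "objstore_prometheus" {
  metadata {
    name      = "objstore"
    namespace = var.prometheus_stack_namespace
  }

  # Reuse the same objstore.yml content as the secret in the thanos namespace
  data = kubernetes_secret_v1.objstore.data
  type = "Opaque"
}

locals {
  # Candidate value for spec.thanos in the Prometheus CR
  prometheus_thanos_spec = {
    blockSize = "2h"
    objectStorageConfig = {
      name = kubernetes_secret_v1.objstore_prometheus.metadata[0].name
      key  = "objstore.yml"
    }
  }
}
```

With this in place, the thanos block in the Prometheus manifest would become thanos = local.prometheus_thanos_spec.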

resource "helm_release" "thanos" {
  repository  = "https://charts.bitnami.com/bitnami"
  chart       = "thanos"
  version     = "15.13.2"
  max_history = 5
  name        = "thanos"
  namespace   = kubernetes_namespace_v1.thanos.metadata[0].name
  timeout     = 300
  values = [jsonencode({
    existingObjstoreSecret = kubernetes_secret_v1.objstore.metadata[0].name
    query = {
      enabled = true
      stores = [
        "prometheus-operated.${var.prometheus_stack_namespace}.svc.cluster.local:10901"
      ]
    }
    queryFrontend = {
      enabled = true
    }
    bucketweb = {
      enabled = false
    }
    compactor = {
      enabled = true
      persistence = {
        storageClass = "" # set the StorageClass used in your environment
      }
    }
    ruler = {
      enabled = false
    }
    receive = {
      enabled = false
    }
    receiveDistributor = {
      enabled = false
    }
    metrics = {
      enabled = false
    }
    minio = {
      enabled = false
    }
  })]
}
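
Because the Prometheus CR runs two replicas, Thanos Query sees every series twice unless it is told which label distinguishes the replicas. The fragment below is a hedged sketch of alternative query values. It assumes the Bitnami chart's query.replicaLabel and query.dnsDiscovery settings (please verify them against the values of chart version 15.13.2) and relies on prometheus_replica being the replica label that the Prometheus Operator adds by default. The local name thanos_query_values is hypothetical.

```hcl
locals {
  # Candidate value for the query key in the thanos helm_release values
  thanos_query_values = {
    enabled = true

    # Deduplicate series coming from the two Prometheus replicas
    replicaLabel = ["prometheus_replica"]

    # Discover every sidecar behind the headless prometheus-operated service
    # through DNS SRV records instead of a single static address
    dnsDiscovery = {
      enabled           = true
      sidecarsService   = "prometheus-operated"
      sidecarsNamespace = var.prometheus_stack_namespace
    }
  }
}
```

It would be used as query = local.thanos_query_values in place of the static stores list.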