First Time Using Grafana Cloud
This post summarizes what I learned through the K8s Advanced Network Study (KANS) run by CloudNet@.
It hasn't really sunk in yet, but this is the final week of the study.
So, using EKS, which you all know well and love dearly, I'll walk step by step through a hands-on that monitors CoreDNS issues.
I'll follow the blog post above exactly as written.
0. Creating the EKS Cluster
I'll create the EKS cluster with the CloudFormation template provided by the study.
eksctl is mentioned in it, so I have a nagging feeling that I'll end up rolling back later and redoing the deployment with eksctl-based CloudFormation, but let's try it (?).
As expected, that worry was unfounded. Thinking back, I believe I was told that everything up to creating the EKS cluster with eksctl on the bastion host is already scripted.


Let's connect to the newly created bastion and check the environment variables and other settings.
ssh -i ~/.ssh/id_ed25519 [email protected] # BASTION-HOST-IP
# The authenticity of host '43.201.85.169 (43.201.85.169)' can't be established.
# ED25519 key fingerprint is SHA256:efFNF+24E7UUEzXzhqBDU0ss74yBmhGiaOI25XOVG9A.
# This key is not known by any other names.
# Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
# Warning: Permanently added '43.201.85.169' (ED25519) to the list of known hosts.
# , #_
# ~\_ ####_ Amazon Linux 2
# ~~ \_#####\
# ~~ \###| AL2 End of Life is 2025-06-30.
# ~~ \#/ ___
# ~~ V~' '->
# ~~~ / A newer version of Amazon Linux is available!
# ~~._. _/
# _/ _/ Amazon Linux 2023, GA and supported until 2028-03-15.
# _/m/' https://aws.amazon.com/linux/amazon-linux-2023/
# 10 package(s) needed for security, out of 13 available
# Run "sudo yum update" to apply all updates.
# (cm112@myeks:N/A) [root@myeks-bastion ~]# clear
tail -f /var/log/cloud-init-output.log
# 66 †php8.1 available [ =stable ]
# 67 awscli1 available [ =stable ]
# 68 †php8.2 available [ =stable ]
# 69 dnsmasq available [ =stable ]
# 70 unbound1.17 available [ =stable ]
# 72 collectd-python3 available [ =stable ]
# † Note on end-of-support. Use 'info' subcommand.
# Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
# cloudinit End!
# Cloud-init v. 19.3-46.amzn2.0.2 finished at Sat, 02 Nov 2024 09:44:24 +0000. Datasource DataSourceEc2. Up 91.81 seconds
^C
tail -f /root/create-eks.log
"availabilityZones": [
"ap-northeast-2c",
"ap-northeast-2b",
"ap-northeast-2a"
],
"cloudWatch": {
"clusterLogging": {}
}
}
^C
kubectl ns default
# Context "[email protected]" modified.
# Active namespace is "default".
eksctl get cluster
# NAME REGION EKSCTL CREATED
# myeks ap-northeast-2 True
eksctl get nodegroup --cluster $CLUSTER_NAME
# CLUSTER NODEGROUP STATUS CREATED MIN SIZE MAX SIZE DESIRED CAPACITY INSTANCE TYPE IMAGE ID ASG NAME TYPE
# myeks ng1 ACTIVE 2024-11-02T09:55:58Z 3 3 3 t3.medium AL2_x86_64 eks-ng1-2cc97626-bf01-5bcc-d680-091e003bd586 managed
export | egrep 'ACCOUNT|AWS_|CLUSTER|KUBERNETES|VPC|Subnet' | egrep -v 'SECRET|KEY'
# declare -x ACCOUNT_ID="<ACCOUNT-ID>"
# declare -x AWS_DEFAULT_REGION="ap-northeast-2"
# declare -x AWS_PAGER=""
# declare -x AWS_REGION="ap-northeast-2"
# declare -x CLUSTER_NAME="myeks"
# declare -x KUBERNETES_VERSION="1.30"
# declare -x PrivateSubnet1="subnet-044cf8b34576820ea"
# declare -x PrivateSubnet2="subnet-0ac2f3cd52e1ae640"
# declare -x PrivateSubnet3="subnet-0e5b144c0039c348b"
# declare -x PubSubnet1="subnet-0fef215562a97f319"
# declare -x PubSubnet2="subnet-0ca12b8db356bd486"
# declare -x PubSubnet3="subnet-01628d89d7c34590b"
# declare -x VPCID="vpc-0bcfa9363c4ff0069"
kubectl get node --label-columns=node.kubernetes.io/instance-type,eks.amazonaws.com/capacityType,topology.kubernetes.io/zone
# NAME STATUS ROLES AGE VERSION INSTANCE-TYPE CAPACITYTYPE ZONE
# ip-192-168-1-219.ap-northeast-2.compute.internal Ready <none> 12m v1.30.4-eks-a737599 t3.medium ON_DEMAND ap-northeast-2a
# ip-192-168-2-198.ap-northeast-2.compute.internal Ready <none> 12m v1.30.4-eks-a737599 t3.medium ON_DEMAND ap-northeast-2b
# ip-192-168-3-85.ap-northeast-2.compute.internal Ready <none> 12m v1.30.4-eks-a737599 t3.medium ON_DEMAND ap-northeast-2c
eksctl get iamidentitymapping --cluster myeks
# ARN USERNAME GROUPS ACCOUNT
# arn:aws:iam::<ACCOUNT-ID>:role/eksctl-myeks-nodegroup-ng1-NodeInstanceRole-bU6W7Cr0ugY5 system:node:{{EC2PrivateDNSName}} system:bootstrappers,system:nodes
1. Setting Up the Environment for the Hands-on
Next, let's set the additional environment variables that the hands-on requires as prerequisites.
export EKS_CLUSTER_NAME="$CLUSTER_NAME"
export SERVICE=prometheusservice
export ACK_SYSTEM_NAMESPACE=ack-system
# export RELEASE_NAME=`curl -sL https://api.github.com/repos/aws-controllers-k8s/$SERVICE-controller/releases/latest | grep '"tag_name":' | cut -d'"' -f4`
Here I ran into a problem with the older post:
the GitHub REST API seems to have been tightened up, so fetching RELEASE_NAME this way has become all but impossible!
curl -sL https://api.github.com/repos/aws-controllers-k8s/$SERVICE-controller/releases/latest
# {
# "message": "Not Found",
# "documentation_url": "https://docs.github.com/rest/releases/releases#get-the-latest-release",
# "status": "404"
# }
Batman! What can we do when we don't have any superpowers?

Normally the right approach would be to pull the repo and check with git tag, but since this is a hands-on, just open the link and look up the latest tag.
I found that tedious, so I took the tag from the 302 Found redirect instead.
# HTTP header lines end in CRLF, so strip the carriage return before taking the last path segment.
curl -sS -I -G https://github.com/aws-controllers-k8s/$SERVICE-controller/releases/latest | grep -i location | tr -d '\r' | awk -F'/' '{print $NF}'
# v1.2.15
export RELEASE_NAME=$(curl -sS -I -G https://github.com/aws-controllers-k8s/$SERVICE-controller/releases/latest | grep -i location | tr -d '\r' | awk -F'/' '{print $NF}')
echo $RELEASE_NAME
# v1.2.15
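For the git-based route mentioned above, a minimal sketch might look like the following. It assumes `git` is installed and that a `sort` with `-V` (version sort, as in GNU coreutils) is available. The helper name `latest_tag` is made up for illustration and is not part of the original hands-on.

```shell
# Hypothetical helper: read `git ls-remote --tags --refs` output on stdin
# and print the highest version-sorted tag name.
latest_tag() {
  awk -F'refs/tags/' 'NF == 2 { print $2 }' | sort -V | tail -n 1
}

# Usage (needs network access, so it is left commented out here):
#   git ls-remote --tags --refs \
#     "https://github.com/aws-controllers-k8s/${SERVICE}-controller" | latest_tag
```

Note that the highest tag is not necessarily the release GitHub marks as "latest" (pre-release tags would sort in too), so treat the result as a starting point rather than an exact equivalent of the redirect trick.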
2. Following the Hands-On, No Questions Asked
(a) Create the Amazon Managed Prometheus Workspace
aws amp create-workspace --alias blog-workspace --region $AWS_REGION
# {
# "arn": "arn:aws:aps:ap-northeast-2:<ACCOUNT-ID>:workspace/ws-0d032a51-2b98-43b1-90cb-f5069329f1af",
# "status": {
# "statusCode": "CREATING"
# },
# "tags": {},
# "workspaceId": "ws-0d032a51-2b98-43b1-90cb-f5069329f1af"
# }
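The workspace comes back with status CREATING, so it may be worth waiting until it is ACTIVE before pointing anything at it. The sketch below polls `aws amp describe-workspace`. The function name, the retry count, and the delay are illustrative assumptions, not something the original hands-on prescribes.

```shell
# Hypothetical helper: poll until the AMP workspace reports ACTIVE.
# Arguments: workspace ID, max attempts (default 30), delay in seconds (default 5).
wait_for_workspace() {
  workspace_id="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    status=$(aws amp describe-workspace \
      --workspace-id "$workspace_id" \
      --query 'workspace.status.statusCode' --output text)
    if [ "$status" = "ACTIVE" ]; then
      echo "workspace $workspace_id is ACTIVE"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "workspace $workspace_id not ACTIVE after $attempts attempts (last status: $status)" >&2
  return 1
}

# Usage:
#   wait_for_workspace ws-0d032a51-2b98-43b1-90cb-f5069329f1af
```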

(b) Deploy the Prometheus ethtool exporter
- As instructed, let's write the deployment manifest for the exporter.
cat << EOF > ethtool-exporter.yaml
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ethtool-exporter
  labels:
    app: ethtool-exporter
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 100%
  selector:
    matchLabels:
      app: ethtool-exporter
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9417'
      labels:
        app: ethtool-exporter
    spec:
      hostNetwork: true
      terminationGracePeriodSeconds: 0
      containers:
      - name: ethtool-exporter
        env:
        - name: IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        image: drdivano/ethtool-exporter@sha256:39e0916b16de07f62c2becb917c94cbb3a6e124a577e1325505e4d0cdd550d7b
        command:
          - "sh"
          - "-exc"
          - "python3 /ethtool-exporter.py -l \$(IP):9417 -I '(eth|em|eno|ens|enp)[0-9s]+'"
        ports:
        - containerPort: 9417
          hostPort: 9417
          name: http
        resources:
          limits:
            cpu: 250m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 50Mi
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ethtool-exporter
  name: ethtool-exporter
spec:
  clusterIP: None
  ports:
    - name: http
      port: 9417
  selector:
    app: ethtool-exporter
EOF
kubectl apply -f ethtool-exporter.yaml
It's just a simple exporter, so the deployment seems to have gone fine.
kubectl get pods,svc -owide
# NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# pod/ethtool-exporter-b62vt 1/1 Running 0 51s 192.168.2.198 ip-192-168-2-198.ap-northeast-2.compute.internal <none> <none>
# pod/ethtool-exporter-jbdlx 1/1 Running 0 51s 192.168.1.219 ip-192-168-1-219.ap-northeast-2.compute.internal <none> <none>
# pod/ethtool-exporter-pj2r7 1/1 Running 0 51s 192.168.3.85 ip-192-168-3-85.ap-northeast-2.compute.internal <none> <none>
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
# service/ethtool-exporter ClusterIP None <none> 9417/TCP 51s app=ethtool-exporter
# service/kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 16h <none>
(c) Check the ADOT (AWS Distro for OpenTelemetry) Collector requirements
It looks like the prerequisites need to be checked.
- Docs: Requirements for Getting Started with ADOT using EKS Add-Ons
- kubectl, eksctl, AWS CLI v2: confirm they are installed
- Cluster version: confirm v1.21 or later
kubectl version | grep "Server Version"
# Server Version: v1.30.6-eks-7f9249a
- Compatible ADOT add-on version: no extra work is needed unless the version is v0.62.1 or earlier.
aws eks describe-addon-versions --addon-name adot --kubernetes-version 1.30 --query 'addons[0].addonVersions[*].addonVersion'
# [
# "v0.102.1-eksbuild.2",
# "v0.102.1-eksbuild.1",
# "v0.102.0-eksbuild.1"
# ]
(d) Install cert-manager for the ADOT Collector
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml
kubectl get pod -n cert-manager
# NAME READY STATUS RESTARTS AGE
# cert-manager-cainjector-5dbdc949c4-r2wpn 1/1 Running 0 29s
# cert-manager-d68cffc95-wsx5c 1/1 Running 0 29s
# cert-manager-webhook-759ddb6555-fzl24 1/1 Running 0 29s
(e) Create IRSA for the ADOT Collector
It's good for your peace of mind to at least check that the policy ARN actually exists before creating anything.
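One way to do that check is `aws iam get-policy`, which fails if the ARN cannot be retrieved. The wrapper below is only a sketch, and the function name `policy_exists` is made up for illustration. A failure can also mean missing IAM read permissions rather than a missing policy.

```shell
# Hypothetical helper: succeed only if the IAM policy ARN can be retrieved.
policy_exists() {
  aws iam get-policy --policy-arn "$1" >/dev/null 2>&1
}

# Usage:
#   policy_exists arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
#     && echo "policy found" || echo "policy missing or not readable"
```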
echo :$AWS_REGION:$EKS_CLUSTER_NAME:
# :ap-northeast-2:myeks:
eksctl create iamserviceaccount \
--name adot-collector \
--namespace default \
--region $AWS_REGION \
--cluster $EKS_CLUSTER_NAME \
--attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
--approve \
--override-existing-serviceaccounts
# 2024-11-03 11:27:22 [ℹ] 1 iamserviceaccount (default/adot-collector) was included (based on the include/exclude rules)
# 2024-11-03 11:27:22 [!] metadata of serviceaccounts that exist in Kubernetes will be updated, as --override-existing-serviceaccounts was set
# 2024-11-03 11:27:22 [ℹ] 1 task: {
# 2 sequential sub-tasks: {
# create IAM role for serviceaccount "default/adot-collector",
# create serviceaccount "default/adot-collector",
# } }
# 2024-11-03 11:27:22 [ℹ] building iamserviceaccount stack "eksctl-myeks-addon-iamserviceaccount-default-adot-collector"
# 2024-11-03 11:27:22 [ℹ] deploying stack "eksctl-myeks-addon-iamserviceaccount-default-adot-collector"
# 2024-11-03 11:27:22 [ℹ] waiting for CloudFormation stack "eksctl-myeks-addon-iamserviceaccount-default-adot-collector"
# 2024-11-03 11:27:52 [ℹ] waiting for CloudFormation stack "eksctl-myeks-addon-iamserviceaccount-default-adot-collector"
# 2024-11-03 11:27:52 [ℹ] created serviceaccount "default/adot-collector"
(f) Install the ADOT add-on
We already checked the versions, but let's do it once more.
aws eks describe-addon-versions --addon-name adot --kubernetes-version 1.30 \
--query "addons[].addonVersions[].[addonVersion, compatibilities[].defaultVersion]" --output text
# v0.102.1-eksbuild.2
# True
# v0.102.1-eksbuild.1
# False
# v0.102.0-eksbuild.1
# False
aws eks create-addon --addon-name adot --addon-version v0.102.1-eksbuild.2 --cluster-name $EKS_CLUSTER_NAME
# {
# "addon": {
# "addonName": "adot",
# "clusterName": "myeks",
# "status": "CREATING",
# "addonVersion": "v0.102.1-eksbuild.2",
# "health": {
# "issues": []
# },
# "addonArn": "arn:aws:eks:ap-northeast-2:<ACCOUNT-ID>:addon/myeks/adot/eec977ee-84a1-85fe-ecbe-a2f51c90e9e7",
# "createdAt": "2024-11-03T11:31:33.678000+09:00",
# "modifiedAt": "2024-11-03T11:31:33.694000+09:00",
# "tags": {}
# }
# }
Let's check that it was deployed properly.
kubectl get po -n opentelemetry-operator-system
# NAME READY STATUS RESTARTS AGE
# opentelemetry-operator-b7dbbdf7c-tqvfl 2/2 Running 0 64s
(g) Configure the ADOT Collector
Write collector-config-amp.yaml as shown below and deploy it.
- Double-check the environment variables.
  - AMP_REMOTE_WRITE_ENDPOINT: yes, this is the workspace created earlier.
  - AWS_REGION
  - EKS_CLUSTER_NAME
# export AMP_REMOTE_WRITE_ENDPOINT=<AMP_REMOTE_WRITE_ENDPOINT>
export AMP_REMOTE_WRITE_ENDPOINT=https://aps-workspaces.ap-northeast-2.amazonaws.com/workspaces/ws-0d032a51-2b98-43b1-90cb-f5069329f1af/api/v1/remote_write
echo $AMP_REMOTE_WRITE_ENDPOINT
# https://aps-workspaces.ap-northeast-2.amazonaws.com/workspaces/ws-0d032a51-2b98-43b1-90cb-f5069329f1af/api/v1/remote_write
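The endpoint above follows a fixed pattern (region plus workspace ID), so it can also be assembled rather than pasted by hand. The helper below is a small sketch. The name `amp_remote_write_url` is hypothetical, and the URL shape is simply the one shown in this post.

```shell
# Hypothetical helper: build the AMP remote_write URL from a region and a workspace ID.
amp_remote_write_url() {
  region="$1"
  workspace_id="$2"
  printf 'https://aps-workspaces.%s.amazonaws.com/workspaces/%s/api/v1/remote_write\n' \
    "$region" "$workspace_id"
}

# Usage:
#   export AMP_REMOTE_WRITE_ENDPOINT=$(amp_remote_write_url "$AWS_REGION" ws-0d032a51-2b98-43b1-90cb-f5069329f1af)
```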
cat > collector-config-amp.yaml <<EOF
---
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-collector-amp
spec:
  mode: deployment
  serviceAccount: adot-collector
  podAnnotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8888'
  resources:
    requests:
      cpu: "1"
    limits:
      cpu: "1"
  config: |
    extensions:
      sigv4auth:
        region: $AWS_REGION
        service: "aps"
    receivers:
      #
      # Scrape configuration for the Prometheus Receiver
      # This is the same configuration used when Prometheus is installed using the community Helm chart
      #
      prometheus:
        config:
          global:
            scrape_interval: 60s
            scrape_timeout: 30s
            external_labels:
              cluster: $EKS_CLUSTER_NAME
          scrape_configs:
          - job_name: kubernetes-pods
            scrape_interval: 15s
            scrape_timeout: 5s
            kubernetes_sd_configs:
            - role: pod
            relabel_configs:
            - action: keep
              regex: true
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_scrape
            - action: replace
              regex: (https?)
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_scheme
              target_label: __scheme__
            - action: replace
              regex: (.+)
              source_labels:
              - __meta_kubernetes_pod_annotation_prometheus_io_path
              target_label: __metrics_path__
            - action: replace
              regex: ([^:]+)(?::\d+)?;(\d+)
              replacement: \$\$1:\$\$2
              source_labels:
              - __address__
              - __meta_kubernetes_pod_annotation_prometheus_io_port
              target_label: __address__
            - action: labelmap
              regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
              replacement: __param_\$\$1
            - action: labelmap
              regex: __meta_kubernetes_pod_label_(.+)
            - action: replace
              source_labels:
              - __meta_kubernetes_namespace
              target_label: kubernetes_namespace
            - action: replace
              source_labels:
              - __meta_kubernetes_pod_name
              target_label: kubernetes_pod_name
            - action: drop
              regex: Pending|Succeeded|Failed|Completed
              source_labels:
              - __meta_kubernetes_pod_phase
    processors:
      batch/metrics:
        timeout: 60s
    exporters:
      prometheusremotewrite:
        endpoint: $AMP_REMOTE_WRITE_ENDPOINT
        auth:
          authenticator: sigv4auth
    service:
      extensions: [sigv4auth]
      pipelines:
        metrics:
          receivers: [prometheus]
          processors: [batch/metrics]
          exporters: [prometheusremotewrite]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-prometheus-role
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-prometheus-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-prometheus-role
subjects:
  - kind: ServiceAccount
    name: adot-collector
    namespace: default
EOF
cat collector-config-amp.yaml | grep remote_write
# endpoint: https://aps-workspaces.ap-northeast-2.amazonaws.com/workspaces/ws-0d032a51-2b98-43b1-90cb-f5069329f1af/api/v1/remote_write
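Because the heredoc expands shell variables while writing the file, an unset variable silently produces an empty value. The grep for remote_write above covers the endpoint. The sketch below extends the same idea to the region and cluster label as well. The function name `check_rendered` and the choice of keys are assumptions made for illustration.

```shell
# Hypothetical helper: fail if region, cluster, or endpoint ended up empty
# in the rendered manifest (for example because a variable was unset).
check_rendered() {
  if grep -En '^[[:space:]]*(region|cluster|endpoint):[[:space:]]*$' "$1"; then
    echo "empty value found in $1" >&2
    return 1
  fi
}

# Usage:
#   check_rendered collector-config-amp.yaml && kubectl apply -f collector-config-amp.yaml
```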
- Deploy...!
kubectl apply -f collector-config-amp.yaml
3. Setting Up Grafana Cloud
Instead of AMG (Amazon Managed Grafana), let's use Grafana Cloud. It's very simple to use!
(a) Sign up for Grafana Cloud
Right, I had never signed up for it before, so I gave it a try.
(b) Enable the plugin with a single click

(c) Configure the Prometheus data source
- Path: Home > Connections > Data sources > grafana-amazonprometheus-datasource
- URL example: https://aps-workspaces.ap-northeast-2.amazonaws.com/workspaces/ws-0d032a51-2b98-43b1-90cb-f5069329f1af
  - I first appended /api/v1/query to the end and kept getting errors, which had me puzzled for a while.
  - Docs: Add the Prometheus data source in Grafana

- Auth: AWS SigV4 is supported.
  - Enter the Access Key ID / Secret Access Key.
  - Assume Role: skipped, since I had no time to file the ticket required for the preview.

- Additional: region and TLS settings

4. Write a Query and Verify

As shown in the blog post, I confirmed that the query output displays correctly.
9. Vaporware
I panicked partway through and wondered whether I had to deploy the Grafana Cloud Agent in order to connect to Grafana Cloud, but that worry was unfounded. It doesn't seem to be necessary.
- Configuring Grafana Cloud Agent for Amazon Managed Service for Prometheus/AWS Open Source Blog
- Amazon Managed Service for Prometheus/Grafana Labs Plugins
(a) Granting permissions
curl https://gist.githubusercontent.com/rfratto/b6c5888e89faed3b04fa2533e0bec1a2/raw/bb9aa5e560009e98b48861d0b2ce54fc8a4303e6/script.bash -o agent-permissions-aks.bash
sed -i "s/YOUR_EKS_CLUSTER_NAME/${EKS_CLUSTER_NAME}/g" agent-permissions-aks.bash
cat agent-permissions-aks.bash | head -n 4
# ##!/bin/bash
# CLUSTER_NAME=myeks # SEE THIS LINE IF CHANGED TO YOUR CLUSTER NAME
# AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
# OIDC_PROVIDER=$(aws eks describe-cluster --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
Let's run it.
ls -al agent-permissions-aks.bash
# -rw-r--r-- 1 root root 4341 Nov 3 12:09 agent-permissions-aks.bash
chmod u+x agent-permissions-aks.bash
ls -al agent-permissions-aks.bash
# -rwxr--r-- 1 root root 4341 Nov 3 12:09 agent-permissions-aks.bash
./agent-permissions-aks.bash
The errors along the way made me flinch, but it looks like everything was created.
# ./agent-permissions-aks.bash
Creating a new trust policy
An error occurred (NoSuchEntity) when calling the GetRole operation: The role with name EKS-GrafanaAgent-AMP-ServiceAccount-Role cannot be found.
Appending to the existing trust policy
An error occurred (NoSuchEntity) when calling the GetPolicy operation: Policy arn:aws:iam::<ACCOUNT-ID>:policy/AWSManagedPrometheusWriteAccessPolicy was not found.
Creating a new permission policy AWSManagedPrometheusWriteAccessPolicy
{
"Policy": {
"PolicyName": "AWSManagedPrometheusWriteAccessPolicy",
"PolicyId": "ANPASTWNT54JUITZSLWOX",
"Arn": "arn:aws:iam::<ACCOUNT-ID>:policy/AWSManagedPrometheusWriteAccessPolicy",
"Path": "/",
"DefaultVersionId": "v1",
"AttachmentCount": 0,
"PermissionsBoundaryUsageCount": 0,
"IsAttachable": true,
"CreateDate": "2024-11-03T03:16:10+00:00",
"UpdateDate": "2024-11-03T03:16:10+00:00"
}
}
An error occurred (NoSuchEntity) when calling the GetRole operation: The role with name EKS-GrafanaAgent-AMP-ServiceAccount-Role cannot be found.
EKS-GrafanaAgent-AMP-ServiceAccount-Role role does not exist. Creating a new role with a trust and permission policy
arn:aws:iam::<ACCOUNT-ID>:role/EKS-GrafanaAgent-AMP-ServiceAccount-Role
2024-11-03 12:16:16 [ℹ] IAM Open ID Connect provider is already associated with cluster "myeks" in "ap-northeast-2"
(b) Deploy the Grafana Cloud Agent
I wanted to do this the way the blog does, but the install-sigv4.sh file is only provided up to v0.18.4, so I'll try it and stop here if it doesn't work.
kubectl create namespace grafana-agent; \
WORKSPACE="ws-0d032a51-2b98-43b1-90cb-f5069329f1af" \
ROLE_ARN="arn:aws:iam::<ACCOUNT-ID>:role/EKS-GrafanaAgent-AMP-ServiceAccount-Role" \
REGION="ap-northeast-2" \
NAMESPACE="grafana-agent" \
REMOTE_WRITE_URL="https://aps-workspaces.$REGION.amazonaws.com/workspaces/$WORKSPACE/api/v1/remote_write" \
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/grafana/agent/v0.18.4/production/kubernetes/install-sigv4.sh)" | kubectl apply -f -
# namespace/grafana-agent created
# serviceaccount/grafana-agent created
# configmap/grafana-agent created
# configmap/grafana-agent-deployment created
# daemonset.apps/grafana-agent created
# deployment.apps/grafana-agent-deployment created
# resource mapping not found for name: "grafana-agent" namespace: "" from "STDIN": no matches for kind "ClusterRole" in version "rbac.authorization.k8s.io/v1beta1"
# ensure CRDs are installed first
# resource mapping not found for name: "grafana-agent" namespace: "" from "STDIN": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
# ensure CRDs are installed first
So I rewrote them by hand as shown below and deployed again.
- What changed
  - Before: v1beta1
  - After: v1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-agent
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: grafana-agent
subjects:
- kind: ServiceAccount
  name: grafana-agent
  namespace: grafana-agent
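The same v1beta1-to-v1 change could probably be applied on the fly instead of by hand, by filtering the generated manifest before it reaches kubectl. This is only a sketch: the function name is hypothetical, and it assumes the apiVersion string is the only thing in these two RBAC resources that needs changing, which is what the manual edit above amounted to.

```shell
# Hypothetical filter: rewrite the removed RBAC v1beta1 apiVersion to v1 on stdin.
fix_rbac_api_version() {
  sed 's#rbac\.authorization\.k8s\.io/v1beta1#rbac.authorization.k8s.io/v1#g'
}

# Usage: insert the filter between the install script and kubectl, e.g.
#   ... install-sigv4.sh)" | fix_rbac_api_version | kubectl apply -f -
```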
kkumtree
Source code on GitHub
© 2025 kkumtree and contributors All rights reserved.
Licensed under
CC BY-NC-ND 4.0