Setting Up Monitoring with Grafana + Prometheus


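Prometheus is a system for collecting and storing metrics, and Grafana is a data visualization tool. Used together, they let you build monitoring dashboards for servers, applications, and infrastructure.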



When should you use it?



  • Monitoring server CPU, memory, and disk

  • Collecting application performance metrics

  • Setting up real-time alerts

  • Visualizing time-series data



Installing with Docker Compose


# docker-compose.yml
version: "3.8"

services:
  prometheus:
    image: prom/prometheus:v2.48.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.retention.time=15d"
    restart: unless-stopped

  grafana:
    image: grafana/grafana:10.2.2
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin123
    depends_on:
      - prometheus
    restart: unless-stopped

  node_exporter:
    image: prom/node-exporter:v1.7.0
    container_name: node_exporter
    ports:
      - "9100:9100"
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:


Prometheus Configuration


# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: []

scrape_configs:
  # Prometheus monitoring itself
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  # Server metrics (Node Exporter)
  - job_name: "node"
    static_configs:
      - targets: ["node_exporter:9100"]

  # Spring Boot application
  - job_name: "spring-app"
    metrics_path: "/actuator/prometheus"
    static_configs:
      - targets: ["app:8080"]

  # Monitoring multiple servers
  - job_name: "web-servers"
    static_configs:
      - targets:
          - "server1:9100"
          - "server2:9100"
          - "server3:9100"
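
Each target under scrape_configs is expected to answer an HTTP GET on its metrics path with plain text in the Prometheus exposition format. As a rough illustration of what one scraped sample looks like, the sketch below renders a counter by hand using only the JDK. The class `ExpositionSketch` and the metric `demo_requests_total` are hypothetical names chosen for this example; a real application would normally let a client library such as Micrometer produce this output.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper: formats one counter sample in the Prometheus text exposition format.
final class ExpositionSketch {

    // Note: label values are not escaped here, so this only suits simple values.
    static String renderCounter(String name, String help, Map<String, String> labels, double value) {
        // Sort the labels so the output is deterministic.
        String labelPart = new TreeMap<String, String>(labels).entrySet().stream()
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        String series = labelPart.isEmpty() ? name : name + "{" + labelPart + "}";
        return "# HELP " + name + " " + help + "\n"
                + "# TYPE " + name + " counter\n"
                + series + " " + value + "\n";
    }

    public static void main(String[] args) {
        // Prints the HELP line, the TYPE line, and one sample line.
        System.out.print(renderCounter("demo_requests_total", "Total demo requests.",
                Map.of("method", "GET"), 42.0));
    }
}
```

Prometheus attaches the `job` and `instance` labels itself at scrape time, which is why they do not appear in the rendered text.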


Spring Boot Integration


<!-- Add these dependencies -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
  endpoint:
    prometheus:
      enabled: true
  metrics:
    tags:
      application: ${spring.application.name}


Adding Custom Metrics


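import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Component;

import java.util.concurrent.atomic.AtomicInteger;

@Component
public class CustomMetrics {

    private final Counter orderCounter;
    // Kept as a field: a gauge holds only a weak reference to the object it observes,
    // so a local variable could be garbage collected and the gauge would stop reporting.
    private final AtomicInteger activeUsers = new AtomicInteger(0);
    private final Timer responseTimer;

    public CustomMetrics(MeterRegistry registry) {
        // Counter: number of orders
        this.orderCounter = Counter.builder("app.orders.total")
                .description("Total number of orders")
                .tag("type", "order")
                .register(registry);

        // Gauge: currently active users
        Gauge.builder("app.users.active", activeUsers, AtomicInteger::get)
                .description("Number of active users")
                .register(registry);

        // Timer: API response time
        this.responseTimer = Timer.builder("app.api.response.time")
                .description("API response time")
                .register(registry);
    }

    public void recordOrder() {
        orderCounter.increment();
    }

    // Updates the value reported by the gauge
    public void setActiveUsers(int count) {
        activeUsers.set(count);
    }

    public void measureApiCall(Runnable task) {
        responseTimer.record(task);
    }
}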
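
The dotted meter names above do not appear verbatim in Prometheus. When the Prometheus registry exports them, dots become underscores, and a timer is exposed as several series with a `_seconds` base unit. The sketch below is a simplified approximation of that naming behaviour, written with the JDK only so it can be run without Spring. `MetricNameSketch` and `MeterKind` are hypothetical names, and this is not Micrometer's own code; checking the `/actuator/prometheus` output remains the reliable way to confirm the exact names.

```java
import java.util.List;

// Hypothetical, simplified approximation of how dotted meter names
// show up as Prometheus series names.
final class MetricNameSketch {

    enum MeterKind { COUNTER, GAUGE, TIMER }

    static List<String> prometheusNames(String meterName, MeterKind kind) {
        String base = meterName.replace('.', '_');
        switch (kind) {
            case COUNTER: {
                // Counters carry a _total suffix; it is not duplicated if already present.
                return List.of(base.endsWith("_total") ? base : base + "_total");
            }
            case TIMER: {
                // Timers are exported in seconds as count, sum, and max series.
                String seconds = base.endsWith("_seconds") ? base : base + "_seconds";
                return List.of(seconds + "_count", seconds + "_sum", seconds + "_max");
            }
            default:
                return List.of(base);
        }
    }

    public static void main(String[] args) {
        System.out.println(prometheusNames("app.orders.total", MeterKind.COUNTER));
        System.out.println(prometheusNames("app.users.active", MeterKind.GAUGE));
        System.out.println(prometheusNames("app.api.response.time", MeterKind.TIMER));
    }
}
```

Under these assumptions, a PromQL query for the order counter would reference `app_orders_total` rather than `app.orders.total`.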


Grafana Dashboard Setup


1. Open Grafana: http://localhost:3000
2. Data Sources > Add > Prometheus
   URL: http://prometheus:9090
3. Dashboards > Import

# Recommended dashboard IDs
- Node Exporter Full: 1860
- Spring Boot Statistics: 11378
- JVM Micrometer: 4701

# Example PromQL queries
# CPU usage (%)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory usage (%)
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# HTTP requests per second
rate(http_server_requests_seconds_count[5m])

# Average response time (seconds)
rate(http_server_requests_seconds_sum[5m]) / rate(http_server_requests_seconds_count[5m])
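
The CPU query works by measuring how fast the idle-time counter grows and subtracting that fraction from 100. The sketch below reproduces the arithmetic for a single CPU core with made-up sample values. It treats `rate()` as a plain difference divided by elapsed time, which ignores the extrapolation and counter-reset handling that Prometheus performs; `CpuUsageSketch` is a hypothetical name.

```java
// Hypothetical worked example of the arithmetic behind the CPU usage query.
final class CpuUsageSketch {

    // Rough stand-in for rate(): per-second increase of a counter between two samples.
    static double perSecondRate(double first, double last, double elapsedSeconds) {
        return (last - first) / elapsedSeconds;
    }

    // Mirrors "100 - (idle rate * 100)" from the query.
    static double cpuUsagePercent(double idleRate) {
        return 100.0 - idleRate * 100.0;
    }

    public static void main(String[] args) {
        // Made-up samples: the idle counter grew from 1000 s to 1240 s over a 300 s window.
        double idleRate = perSecondRate(1000.0, 1240.0, 300.0); // 0.8 idle seconds per second
        // The core was idle 80% of the time, so usage is roughly 20 percent.
        System.out.println(cpuUsagePercent(idleRate));
    }
}
```

In the real query, `avg by (instance)` then averages this idle fraction across all cores of each instance before the subtraction.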


Alert Configuration (Alertmanager)


# alertmanager.yml
global:
  slack_api_url: "https://hooks.slack.com/services/..."

route:
  receiver: "slack-notifications"
  group_by: ["alertname"]

receivers:
  - name: "slack-notifications"
    slack_configs:
      - channel: "#alerts"
        text: "{{ .CommonAnnotations.summary }}"

# Prometheus alert rules (a separate file, loaded through rule_files in prometheus.yml)
groups:
  - name: app-alerts
    rules:
      - alert: HighCpuUsage
        expr: (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"

      - alert: HighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage detected"
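
The `for: 5m` clause in these rules means an alert does not fire the moment its expression becomes true. It first enters a pending state and fires only if the expression stays true at every evaluation for the full duration. The sketch below simulates that state machine with the JDK only; `ForClauseSketch` and its method names are hypothetical, and the model leaves out details such as per-series tracking.

```java
// Hypothetical simulation of how a "for" duration gates an alert.
final class ForClauseSketch {

    enum State { INACTIVE, PENDING, FIRING }

    private final long forSeconds;
    private long activeSince = -1; // evaluation time when the condition first held; -1 = not active

    ForClauseSketch(long forSeconds) {
        this.forSeconds = forSeconds;
    }

    // Called once per rule evaluation (every 15 s with the evaluation_interval shown earlier).
    State evaluate(long nowSeconds, boolean conditionTrue) {
        if (!conditionTrue) {
            activeSince = -1; // any false evaluation resets the waiting period
            return State.INACTIVE;
        }
        if (activeSince < 0) {
            activeSince = nowSeconds;
        }
        return (nowSeconds - activeSince >= forSeconds) ? State.FIRING : State.PENDING;
    }

    public static void main(String[] args) {
        ForClauseSketch alert = new ForClauseSketch(300); // for: 5m
        for (long t = 0; t <= 330; t += 15) {
            boolean high = t != 315; // condition holds except for one dip at t=315
            System.out.println(t + "s -> " + alert.evaluate(t, high));
        }
    }
}
```

A short spike that clears before five minutes have passed therefore never reaches Alertmanager, which keeps brief bursts from producing notifications.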


Running and Verifying


# Start the stack
docker-compose up -d

# Verify
# Prometheus: http://localhost:9090
# Grafana: http://localhost:3000 (admin/admin123)
# Node Exporter: http://localhost:9100/metrics
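
Beyond opening the pages above in a browser, scraping can be checked through the Prometheus HTTP API by running an instant query for the built-in `up` metric. The sketch below only builds the request URL for the `/api/v1/query` endpoint, taking care to URL-encode the PromQL expression; `PromQueryUrlSketch` is a hypothetical name, and the base URL assumes the port mapping from the Compose file.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds an instant-query URL for the Prometheus HTTP API.
final class PromQueryUrlSketch {

    static URI instantQuery(String baseUrl, String promql) {
        // Braces, quotes, and '=' in PromQL must be percent-encoded in the query string.
        String encoded = URLEncoder.encode(promql, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/v1/query?query=" + encoded);
    }

    public static void main(String[] args) {
        // Asks whether the "node" job's targets were scraped successfully.
        System.out.println(instantQuery("http://localhost:9090", "up{job=\"node\"}"));
    }
}
```

Fetching the resulting URL, for example with curl or `java.net.http.HttpClient`, returns JSON in which a value of 1 for `up` means the most recent scrape of that target succeeded.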