To set up comprehensive monitoring and logging for applications deployed on AWS EKS using Grafana, Prometheus, and the ELK stack (Elasticsearch, Logstash, Kibana), along with an alerting mechanism, follow these steps:
1. Setting Up Prometheus for Metrics Collection
1.1. Install Prometheus
Create a prometheus.yml configuration file:
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'kubernetes-cadvisor'
    kubernetes_sd_configs:
      - role: node

  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod

  - job_name: 'application'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['<application-service>:<port>']
Deploy Prometheus using Helm:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus --namespace monitoring --create-namespace
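When Prometheus is installed with this Helm chart, the configuration above is normally supplied through the chart's values rather than a hand-managed prometheus.yml; the chart already ships default scrape jobs for nodes, cAdvisor, and annotated pods. A minimal sketch for adding the application job, assuming the chart exposes an extraScrapeConfigs value (check your chart version's values.yaml; the custom-values.yaml file name is illustrative):
# custom-values.yaml (illustrative)
extraScrapeConfigs: |
  - job_name: 'application'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['<application-service>:<port>']
Apply it with:
helm upgrade --install prometheus prometheus-community/prometheus --namespace monitoring --create-namespace -f custom-values.yaml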
Verify Prometheus Deployment:
Check that the prometheus-server pod (and any bundled components) is running:
kubectl get pods -n monitoring
Access the Prometheus UI via port-forwarding:
kubectl port-forward service/prometheus-server 9090:80 -n monitoring
1.2. Set Up Grafana for Visualization
Deploy Grafana using Helm:
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana --namespace monitoring
Access Grafana:
Port-forward to access Grafana:
kubectl port-forward service/grafana 3000:80 -n monitoring
Open Grafana in your browser at http://localhost:3000.
Configure Grafana to Use Prometheus:
Log in to Grafana as admin. The Helm chart generates the admin password and stores it in a secret named after the release (grafana here); retrieve it with:
kubectl get secret grafana -n monitoring -o jsonpath="{.data.admin-password}" | base64 --decode
Go to Configuration > Data Sources > Add data source.
Select Prometheus and set the URL to http://prometheus-server.monitoring.svc.cluster.local:80.
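Alternatively, the data source can be provisioned at install time instead of through the UI. A minimal sketch, assuming the Grafana chart exposes a datasources value (the grafana-values.yaml file name is illustrative; check your chart version):
# grafana-values.yaml (illustrative)
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.monitoring.svc.cluster.local
        access: proxy
        isDefault: true
Apply it with:
helm upgrade --install grafana grafana/grafana --namespace monitoring -f grafana-values.yaml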
Create Dashboards:
Use existing templates or create your own dashboards in Grafana to visualize metrics from Prometheus.
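For example, a panel showing per-pod CPU usage can be built on the standard cAdvisor metric scraped above (the namespace label is illustrative; adjust it to your workloads):
sum(rate(container_cpu_usage_seconds_total{namespace="default"}[5m])) by (pod)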
2. Setting Up ELK Stack for Logging
2.1. Install Elasticsearch
Deploy Elasticsearch using Helm:
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch --namespace logging --create-namespace
Verify Elasticsearch Deployment:
Check the Elasticsearch pods (by default the elastic chart creates a StatefulSet named elasticsearch-master):
kubectl get pods -n logging
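Optionally, confirm the cluster is healthy by port-forwarding to the Elasticsearch service (elasticsearch-master by default) and querying the health API; if your chart version enables security by default, switch to https and use the generated elastic user credentials:
kubectl port-forward service/elasticsearch-master 9200:9200 -n logging
curl "http://localhost:9200/_cluster/health?pretty"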
2.2. Install Logstash
Deploy Logstash using Helm:
helm install logstash elastic/logstash --namespace logging
Configure Logstash:
Create a ConfigMap for Logstash configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: logging
data:
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }
    output {
      elasticsearch {
        # Default service name created by the elastic/elasticsearch chart; adjust to your release,
        # and add user/password/ssl settings if Elasticsearch security is enabled.
        hosts => ["http://elasticsearch-master:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }
Apply the ConfigMap:
kubectl apply -f logstash-configmap.yaml
Make sure Logstash actually loads this pipeline, either by mounting the ConfigMap into the Logstash pods or by supplying the pipeline through the chart's values, as sketched below.
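A minimal sketch of the values-based route, assuming the Logstash chart exposes a logstashPipeline value (the logstash-values.yaml file name is illustrative; add credentials/TLS settings if Elasticsearch security is enabled):
# logstash-values.yaml (illustrative)
logstashPipeline:
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }
    output {
      elasticsearch {
        hosts => ["http://elasticsearch-master:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }
Apply it with:
helm upgrade --install logstash elastic/logstash --namespace logging -f logstash-values.yaml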
2.3. Install Kibana
Deploy Kibana using Helm:
helm install kibana elastic/kibana --namespace logging
Access Kibana:
Port-forward to access Kibana (the Service name depends on your release; with a release named kibana the elastic chart typically creates kibana-kibana, which you can confirm with kubectl get svc -n logging):
kubectl port-forward service/kibana-kibana 5601:5601 -n logging
Open Kibana in your browser at http://localhost:5601.
2.4. Configure Log Shipping
Logs still need a shipper running on the cluster nodes. The Logstash pipeline above listens for Beats traffic on port 5044, so Filebeat (deployed as a DaemonSet) is the most direct fit; Fluentd or Fluent Bit can also forward to Logstash, but they need a matching input (for example tcp or http) rather than the beats input. A Filebeat sketch follows below.
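A minimal Filebeat sketch using the elastic/filebeat chart, assuming it exposes a filebeatConfig value; the logstash-logstash:5044 host is an assumed service name, so point it at whatever Service actually exposes your Logstash beats port (kubectl get svc -n logging):
# filebeat-values.yaml (illustrative)
filebeatConfig:
  filebeat.yml: |
    filebeat.inputs:
      - type: container
        paths:
          - /var/log/containers/*.log
    output.logstash:
      hosts: ["logstash-logstash:5044"]   # assumed Logstash service name and port
Deploy it with:
helm install filebeat elastic/filebeat --namespace logging -f filebeat-values.yaml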
3. Setting Up Alerts
3.1. Configure Alerts in Prometheus
Define Alerting Rules:
Create an alert.rules.yml file:
groups:
  - name: example
    rules:
      - alert: HighCpuUsage
        # On recent Kubernetes versions the cAdvisor label is "container"; older clusters use "container_name".
        expr: sum(rate(container_cpu_usage_seconds_total[1m])) by (container) > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected"
Update Prometheus Configuration:
Add the alert rules file to the Prometheus configuration (prometheus.yml):
rule_files:
  - "/etc/prometheus/alert.rules.yml"
3.2. Set Up Alertmanager
Deploy Alertmanager using Helm. Note that recent versions of the prometheus chart already include Alertmanager as an enabled sub-chart, so a standalone install is only needed if you disabled it there or want to manage Alertmanager separately:
helm install alertmanager prometheus-community/alertmanager --namespace monitoring
Configure Alertmanager:
Create a ConfigMap for Alertmanager configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-config
  namespace: monitoring
data:
  alertmanager.yml: |
    global:
      smtp_smarthost: 'smtp.example.com:587'
      smtp_from: 'alertmanager@example.com'
      smtp_auth_username: 'username'
      smtp_auth_password: 'password'
    route:
      receiver: 'email-config'
    receivers:
      - name: 'email-config'
        email_configs:
          - to: 'you@example.com'
Apply the ConfigMap:
kubectl apply -f alertmanager-configmap.yaml
Update Prometheus to Use Alertmanager:
Add Alertmanager to the Prometheus configuration:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'alertmanager.monitoring.svc.cluster.local:9093'
Test Alerts:
Create a test alert and check that notifications are delivered, for example by pushing a synthetic alert straight to Alertmanager's API as shown below.
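One way to exercise the notification path end to end is to push a synthetic alert into Alertmanager's v2 API (the service name below assumes a release called alertmanager; adjust it to your deployment):
kubectl port-forward service/alertmanager 9093:9093 -n monitoring &
curl -XPOST http://localhost:9093/api/v2/alerts \
  -H "Content-Type: application/json" \
  -d '[{"labels": {"alertname": "TestAlert", "severity": "critical"},
        "annotations": {"summary": "Synthetic alert to verify notifications"}}]'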
Summary
Metrics Collection:
Deploy Prometheus and configure it to scrape metrics.
Deploy Grafana and set it up to use Prometheus as a data source.
Create dashboards in Grafana for visualization.
Logging:
Deploy the ELK stack (Elasticsearch, Logstash, Kibana).
Configure Logstash to ingest logs and send them to Elasticsearch.
Use Kibana for log analysis.
Alerting:
Set up alerting rules in Prometheus.
Deploy Alertmanager and configure it for notifications.
Integrate Prometheus with Alertmanager for alerting.
By following these steps, you'll have a robust monitoring and logging setup for your AWS EKS applications, utilizing Grafana, Prometheus, and the ELK stack, complete with alerting capabilities.
Additional Considerations
Security and Best Practices
Network Policies: Implement Kubernetes network policies to restrict traffic between pods based on their roles and requirements (see the sketch after this list).
Access Controls: Use Kubernetes RBAC (Role-Based Access Control) to restrict access to Prometheus, Grafana, Elasticsearch, Logstash, and Kibana.
Data Persistence: Configure persistent storage for Elasticsearch to ensure data is not lost if the pod restarts.
Resource Management: Set resource limits and requests for Prometheus, Grafana, and ELK components to ensure efficient use of cluster resources.
Backup and Recovery: Implement backup solutions for Elasticsearch data to prevent data loss.
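As a sketch of the network-policy point above, the following only allows pods labelled as Logstash or Kibana to reach Elasticsearch on port 9200 (all pod labels here are assumptions; match them to the labels your charts actually apply):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-elasticsearch-ingress
  namespace: logging
spec:
  podSelector:
    matchLabels:
      app: elasticsearch-master    # assumed label; check your Elasticsearch pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: logstash-logstash    # assumed label
        - podSelector:
            matchLabels:
              app: kibana-kibana        # assumed label
      ports:
        - protocol: TCP
          port: 9200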
Scaling
Prometheus Scaling: Consider using Prometheus Federation or a more scalable monitoring solution if dealing with large-scale environments.
Elasticsearch Scaling: Ensure Elasticsearch nodes are appropriately sized and scaled to handle your logging volume. Use index lifecycle management (ILM) to manage data retention and storage (see the sketch after this list).
Log Rotation: Configure Logstash and Elasticsearch to handle log rotation and prevent disk space exhaustion.
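As a sketch of the ILM point above, a delete-only retention policy created through the Elasticsearch ILM API, run via the port-forward shown earlier or from inside the cluster (the policy name and 30-day retention are illustrative; attach the policy to your log indices via an index template):
curl -XPUT "http://localhost:9200/_ilm/policy/logs-retention" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "phases": {
        "delete": { "min_age": "30d", "actions": { "delete": {} } }
      }
    }
  }'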
Integration with AWS Services
CloudWatch Integration: Consider integrating with AWS CloudWatch for additional metrics and logs.
IAM Roles: Ensure proper IAM roles and policies are applied for accessing AWS resources securely; on EKS this is typically done with IAM Roles for Service Accounts (IRSA), sketched after this list.
Cost Management: Monitor the costs associated with Prometheus, Grafana, and ELK stack deployments, as these can grow with increased data volume and query load.
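For the IAM point above, IRSA works by annotating the workload's service account with a role ARN so pods receive scoped AWS credentials instead of relying on node-level permissions. A minimal sketch (the service-account name and role ARN are placeholders):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: monitoring-sa              # placeholder name
  namespace: monitoring
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<role-name>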
By addressing these aspects, you’ll ensure a scalable, secure, and efficient monitoring and logging infrastructure for your AWS EKS applications.