Production monitoring is critical for maintaining the performance and reliability of the project. Below are the steps and key components used to monitor the project in a production environment.
Instrumentation and Metrics
Prometheus Configuration
Prometheus is used for collecting metrics. The configuration is defined in the prometheus.yml file. Here’s an example of the relevant configuration for scraping application metrics:

    scrape_configs:
      - job_name: 'gitlab-app'
        static_configs:
          - targets: ['localhost:8080']
This configuration tells Prometheus to scrape the application at localhost:8080. Unless metrics_path is set, Prometheus requests the /metrics path on each target.
Metrics Collection
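Metrics are exposed via an HTTP endpoint (the prometheus-client gem provides the Prometheus::Middleware::Exporter Rack middleware for this, serving /metrics by default). The code snippet below registers a request counter within a Ruby on Rails application and increments it on every request:

    require 'prometheus/client'

    # Create the counter and register it on the default registry.
    # Label names must be declared up front in current prometheus-client versions.
    HTTP_REQUESTS_TOTAL = Prometheus::Client::Counter.new(
      :http_requests_total,
      docstring: 'A counter of HTTP requests made.',
      labels: [:method, :path]
    )
    Prometheus::Client.registry.register(HTTP_REQUESTS_TOTAL)

    class ApplicationController < ActionController::Base
      # A constant is used because a top-level local variable is not visible inside the class.
      def process_action(method_name, *args)
        HTTP_REQUESTS_TOTAL.increment(labels: { method: request.method, path: request.path })
        super
      end
    end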
The above code ensures that every incoming HTTP request increments the http_requests_total counter.
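To make concrete what Prometheus reads when it scrapes the endpoint, the sketch below renders a counter in the Prometheus text exposition format using only the Ruby standard library. It is an illustration, not the prometheus-client implementation: render_counter and escape_label are hypothetical helper names, and the sample values are made up.

```ruby
# Hypothetical helper: escape a label value for the text exposition format
# (backslash, double quote, and newline must be escaped).
def escape_label(value)
  value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }.gsub("\n") { '\\n' }
end

# Hypothetical helper: render one counter with its labelled samples.
# samples is an array of [labels_hash, value] pairs.
def render_counter(name, docstring, samples)
  lines = ["# HELP #{name} #{docstring}", "# TYPE #{name} counter"]
  samples.each do |labels, value|
    if labels.empty?
      lines << "#{name} #{value}"
    else
      label_text = labels.map { |key, val| %(#{key}="#{escape_label(val)}") }.join(',')
      lines << "#{name}{#{label_text}} #{value}"
    end
  end
  lines.join("\n") + "\n"
end

# Made-up sample data for illustration only.
puts render_counter(
  'http_requests_total',
  'A counter of HTTP requests made.',
  [[{ method: 'GET', path: '/health' }, 3]]
)
```

Each distinct combination of label values becomes its own time series, so a label such as path can produce many series if the application serves many distinct paths.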
Log Aggregation
Structured Logging
Structured logging facilitates log parsing and querying. The following Ruby code demonstrates how to implement structured logging using Ruby's standard logger library:

    require 'json'
    require 'logger'

    class CustomLogger
      def initialize
        @logger = Logger.new($stdout)
      end

      # Logs one request as a single JSON object.
      # The response is passed in explicitly so its status is available here.
      def log_request(request, response)
        @logger.info(
          {
            time: Time.now,
            method: request.method,
            path: request.path,
            status: response.status
          }.to_json
        )
      end
    end
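This implementation writes each request as a JSON message, allowing easier integration with log management systems. Note that Logger's default formatter prepends the severity and a timestamp to every line, so a custom formatter is needed if the downstream parser expects each line to be pure JSON.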
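As a minimal sketch of emitting one pure JSON object per line, the example below replaces the Logger formatter. The helper name build_json_logger and the field names level and time are assumptions chosen for illustration, not part of the project.

```ruby
require 'json'
require 'logger'
require 'time'

# Hypothetical helper: returns a Logger whose every line is one JSON object.
def build_json_logger(io = $stdout)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, time, _progname, message|
    # Hash messages are merged into the entry; anything else goes under :message.
    payload = message.is_a?(Hash) ? message : { message: message.to_s }
    JSON.generate({ level: severity, time: time.utc.iso8601(3) }.merge(payload)) + "\n"
  end
  logger
end

logger = build_json_logger
logger.info({ method: 'GET', path: '/health', status: 200 })
```

Passing a hash rather than a pre-serialized string keeps the fields at the top level of the JSON object, which is usually what log pipelines expect when they parse each line.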
Centralized Logging Setup
For centralized log aggregation, ship the logs to a central store such as Elasticsearch or a hosted logging platform. The following example demonstrates how to configure a Fluentd tail input:

    <source>
      @type tail
      path /var/log/gitlab/*.log
      pos_file /var/log/gitlab/fluentd.pos
      tag gitlab.*
      format json
    </source>
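With this setup, Fluentd tails the matching log files, parses each line as JSON, and records its read position in the pos_file so it can resume after a restart. Sending the structured logs on to the central log management system additionally requires a <match> output section for the gitlab.* tags, which is not shown here.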
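The idea behind tailing with a position file can be sketched in a few lines of plain Ruby. This is a conceptual illustration only, not how Fluentd is implemented: read_new_entries is a hypothetical helper, and the position file here stores a single byte offset for a single log file.

```ruby
require 'json'

# Hypothetical helper: return the JSON entries appended to log_path since the
# offset stored in pos_path, then store the new offset.
def read_new_entries(log_path, pos_path)
  offset = File.exist?(pos_path) ? File.read(pos_path).to_i : 0
  entries = []
  File.open(log_path, 'r') do |file|
    offset = 0 if offset > file.size # file was truncated or rotated
    file.seek(offset)
    file.each_line do |line|
      break unless line.end_with?("\n") # incomplete line, retry on the next call
      begin
        entries << JSON.parse(line)
      rescue JSON::ParserError
        # Malformed lines are skipped in this sketch.
      end
      offset = file.pos
    end
  end
  File.write(pos_path, offset.to_s)
  entries
end
```

Calling the helper repeatedly returns only entries written since the previous call, which is the property that lets a collector restart without re-sending or losing log lines.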
Alerting
Alert Rules in Prometheus
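Alerting rules are defined in a separate rules file that prometheus.yml loads through its rule_files setting. Here is an example of an alert rule configuration that triggers when the error rate exceeds a threshold. The expression compares requests answered with status 500 against all requests, so it assumes that http_requests_total carries a status label:

    groups:
      - name: application_alerts
        rules:
          - alert: HighErrorRate
            # Fraction of requests answered with status 500 over the last 5 minutes.
            expr: sum(rate(http_requests_total{status="500"}[5m])) / sum(rate(http_requests_total[5m])) > 0.1
            for: 10m
            labels:
              severity: critical
            annotations:
              summary: "High error rate detected"
              description: "The error rate is above 10% for the last 10 minutes."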
This rule helps to quickly identify and respond to high error rates within the application.
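An absolute rate of failed requests (errors per second) and an error ratio (failed requests as a share of all requests) are different quantities, and a threshold such as 10% only makes sense for the ratio. The sketch below works through the arithmetic on two counter samples taken at the start and end of a window. It is a simplification: the helper names and sample numbers are made up, and the real rate() function in Prometheus also handles counter resets and extrapolation.

```ruby
# Average per-second increase of a counter between two samples.
def per_second_rate(first_value, last_value, window_seconds)
  (last_value - first_value) / window_seconds.to_f
end

# Share of requests that failed within the window (0.0 when there was no traffic).
def error_ratio(errors_first:, errors_last:, total_first:, total_last:, window_seconds:)
  total_rate = per_second_rate(total_first, total_last, window_seconds)
  return 0.0 if total_rate.zero?

  per_second_rate(errors_first, errors_last, window_seconds) / total_rate
end

# Made-up samples over a 5-minute (300 s) window:
# 3000 requests in total, 360 of them failed.
ratio = error_ratio(
  errors_first: 50, errors_last: 410,
  total_first: 1000, total_last: 4000,
  window_seconds: 300
)
puts "error ratio: #{(ratio * 100).round(1)}%"
puts 'threshold exceeded' if ratio > 0.1
```

In this made-up example the failure rate is about 1.2 requests per second while the ratio is about 12%, so the two formulations would need very different thresholds.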
Integrating with Alerting Systems
Alerts can be routed to tools like PagerDuty or Slack. The following example illustrates how to configure Alertmanager to send alerts to Slack:
    receivers:
      - name: 'slack-notifications'
        slack_configs:
          - api_url: '<YOUR_SLACK_WEBHOOK_URL>'
            channel: '#alerts'
            text: '{{ .CommonAnnotations.summary }}: {{ .CommonAnnotations.description }}'
This configuration defines the receiver that posts to the specified Slack channel. Alerts reach it once a route in the same Alertmanager configuration names slack-notifications as its receiver.
Health Checks
Application Health Endpoint
Implementing health checks is essential for monitoring the application’s availability. Below is an example of a health check endpoint in Rails:
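    # config/routes.rb
    Rails.application.routes.draw do
      # Respond directly from the router with a small Rack endpoint.
      get '/health', to: proc { [200, { 'Content-Type' => 'text/plain' }, ['OK']] }
    end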
This endpoint can be monitored by external tools to ensure that the application is running properly.
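A health endpoint is often more useful when it also reports on the dependencies the application needs. The following is a rough sketch of a Rack-compatible health check that returns 200 when every check passes and 503 otherwise. The HealthCheck class and the check names are hypothetical, and the lambdas stand in for real connectivity checks.

```ruby
require 'json'

# Hypothetical Rack-compatible endpoint: runs each named check and reports the result.
class HealthCheck
  # checks maps a name to a callable that returns true when the dependency is healthy.
  def initialize(checks = {})
    @checks = checks
  end

  def call(_env)
    results = @checks.transform_values do |check|
      begin
        check.call ? 'ok' : 'failed'
      rescue StandardError => e
        "error: #{e.message}"
      end
    end
    healthy = results.values.all? { |result| result == 'ok' }
    body = JSON.generate({ status: healthy ? 'ok' : 'unavailable', checks: results })
    [healthy ? 200 : 503, { 'content-type' => 'application/json' }, [body]]
  end
end

# Placeholder checks for illustration; real ones would ping the database, cache, etc.
app = HealthCheck.new({ database: -> { true }, cache: -> { true } })
status, _headers, body = app.call({})
puts "#{status} #{body.first}"
```

Because the object responds to call with a Rack env, it should be mountable anywhere a Rack application is accepted, for example as the target of a route.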
Kubernetes Probes
For applications running on Kubernetes, liveness and readiness probes should be defined. An example Kubernetes deployment configuration might look like this:
    spec:
      containers:
        - name: gitlab-app
          image: gitlab/gitlab-app:latest
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 5
This setup ensures that Kubernetes can automatically manage the container’s lifecycle based on its health status.
These steps outline the key components and configurations for production monitoring, ensuring high performance, reliability, and responsiveness to issues that arise in the application environment.
Source: Internal project documentation.