Monitoring and Logging for screenly/balena-prometheus-exporter

Monitoring Setup

To monitor Balena fleets in production, set up the screenly/balena-prometheus-exporter as follows.

Docker Configuration

The exporter runs in a Docker container. The following Dockerfile builds the image:

FROM python:3.10-alpine

WORKDIR /usr/src/app

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY main.py .

USER nobody

CMD [ "python", "./main.py" ]

This Dockerfile sets up the Python environment, installs the required dependencies, switches to the unprivileged nobody user, and specifies the command that runs the exporter.

Building and Running the Docker Image

Build the Docker image, then run it with the BALENA_TOKEN environment variable set to your Balena API token:

$ docker build -t balena-exporter .
$ docker run -d \
--name balena-exporter \
-p 8000:8000 \
-e BALENA_TOKEN=your_balena_token_here \
balena-exporter

Once the container is running, use curl to confirm that it is serving metrics:

$ curl http://localhost:8000/metrics
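
To check the endpoint from a script rather than by eye, the response can be parsed for the balena_devices_online samples. The following is a minimal, standard-library sketch. The function name parse_online_counts and the SAMPLE text are hypothetical, and the regular expression assumes label values contain no escaped quotes.

```python
import re

# Matches simple sample lines such as:
#   balena_devices_online{fleet_name="test_fleet"} 3.0
# Assumption: label values contain no escaped quotes.
SAMPLE_RE = re.compile(
    r'^balena_devices_online\{fleet_name="(?P<fleet>[^"]*)"\}\s+(?P<value>\S+)$'
)


def parse_online_counts(metrics_text):
    """Return {fleet_name: online_device_count} from exposition-format text."""
    counts = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        match = SAMPLE_RE.match(line)
        if match:  # lines for other metrics simply do not match
            counts[match.group("fleet")] = float(match.group("value"))
    return counts


# Illustrative sample only; real output depends on your fleets.
SAMPLE = """\
# HELP balena_devices_online Devices by status
# TYPE balena_devices_online gauge
balena_devices_online{fleet_name="test_fleet"} 3.0
"""

print(parse_online_counts(SAMPLE))
```

Against a live container, the text could be fetched with urllib.request.urlopen("http://localhost:8000/metrics") and decoded before being passed to the function.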

Configuration Parameters

The optional CRAWL_INTERVAL environment variable sets the crawl interval in seconds and defaults to 60. Set it when starting the container, for example to 30 seconds:

$ docker run -d \
--name balena-exporter \
-p 8000:8000 \
-e BALENA_TOKEN=your_balena_token_here \
-e CRAWL_INTERVAL=30 \
balena-exporter
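
On the Python side, an optional variable like this is typically read with a fallback to its default. The sketch below is an assumption about how such a setting could be handled, not a copy of main.py. The names read_crawl_interval and DEFAULT_CRAWL_INTERVAL are hypothetical, and the positive-value check is an added safeguard.

```python
import os

DEFAULT_CRAWL_INTERVAL = 60  # seconds, the documented default


def read_crawl_interval(environ=os.environ):
    """Return the crawl interval in seconds, falling back to the default."""
    raw = environ.get("CRAWL_INTERVAL")
    if raw is None or raw.strip() == "":
        return DEFAULT_CRAWL_INTERVAL
    interval = int(raw)  # raises ValueError for non-numeric input
    if interval <= 0:
        raise ValueError("CRAWL_INTERVAL must be a positive number of seconds")
    return interval


# Passing a plain dict makes the behaviour easy to try without touching
# the real process environment.
print(read_crawl_interval({}))
print(read_crawl_interval({"CRAWL_INTERVAL": "30"}))
```

Failing fast on a non-numeric or non-positive value surfaces a misconfigured container at startup rather than later during collection.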

Data Collection and Metrics Exposure

The main.py file contains the core functionality for collecting metrics from the Balena API and exposing them in a Prometheus-compatible format.

Key Classes and Methods

BalenaCollector class: collects fleet data from the Balena API.

class BalenaCollector(object):
    def __init__(self):
        pass

    def get_balena_fleets(self):
        ...
    
    def get_fleet_metrics(self, fleet_id):
        ...
    
    def collect(self):
        ...

Data Collection

The collect method gathers metrics across fleets. It calls get_balena_fleets to retrieve the IDs of all accessible fleets, calls get_fleet_metrics for each one, and records the result as a sample of the balena_devices_online gauge, labelled with the fleet name:

def collect(self):
    gauge = GaugeMetricFamily(
        "balena_devices_online", "Devices by status", labels=["fleet_name"]
    )

    for fleet_id in self.get_balena_fleets():
        fleet_name, device_online_count = self.get_fleet_metrics(str(fleet_id))
        gauge.add_metric([fleet_name], float(device_online_count))

    return [gauge]
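
To make the result of this loop concrete, the sketch below renders per-fleet counts as the text a Prometheus scrape of such a gauge would be expected to contain. It is a dependency-free illustration, not how prometheus_client produces its output. The function render_online_gauge and the input mapping are hypothetical, and label values are not escaped.

```python
def render_online_gauge(online_by_fleet):
    """Render {fleet_name: count} as Prometheus text-exposition lines."""
    lines = [
        "# HELP balena_devices_online Devices by status",
        "# TYPE balena_devices_online gauge",
    ]
    for fleet_name, count in online_by_fleet.items():
        # Mirrors gauge.add_metric([fleet_name], float(count)) in collect.
        # Assumption: fleet names contain no quotes or backslashes.
        lines.append(
            f'balena_devices_online{{fleet_name="{fleet_name}"}} {float(count)}'
        )
    return "\n".join(lines) + "\n"


print(render_online_gauge({"test_fleet": 3}), end="")
```

Each fleet becomes one sample line, so a dashboard or alert rule can select a fleet through the fleet_name label.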

API Calls

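The get_fleet_metrics method queries the Balena API for a single fleet and returns the fleet name together with the number of devices currently online. If the request fails, it prints the response body and exits with status 1.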

def get_fleet_metrics(self, fleet_id):
    headers = {
        "Authorization": f"Bearer {BALENA_TOKEN}",
        "Content-Type": "application/json",
    }

    response = requests.get(
        f"https://api.balena-cloud.com/v6/application({fleet_id})?$expand=owns__device/$count($filter=is_online eq true)",
        headers=headers,
    )

    if not response.ok:
        print("Error: {}".format(response.text))
        sys.exit(1)

    device_online_count = response.json()["d"][0]["owns__device"]
    fleet_name = response.json()["d"][0]["app_name"]

    return fleet_name, device_online_count
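
The payload handling at the end of this method can also be isolated into a small function that decodes the response once and fails with a clear message when the shape is unexpected. This is a sketch under assumptions, not the exporter's code. The helper name parse_fleet_response is hypothetical, and the payload shape is taken from the mocked response used in the unit test.

```python
def parse_fleet_response(payload):
    """Extract (fleet_name, online_device_count) from a decoded API payload."""
    try:
        fleet = payload["d"][0]
        return fleet["app_name"], fleet["owns__device"]
    except (KeyError, IndexError, TypeError) as exc:
        # Covers a missing key, an empty "d" list, or a non-dict payload.
        raise ValueError(f"unexpected fleet payload: {payload!r}") from exc


# Same shape as the mocked response in tests/test_exporter.py.
payload = {"d": [{"owns__device": 3, "app_name": "test_fleet"}]}
print(parse_fleet_response(payload))
```

Inside get_fleet_metrics this would be called as parse_fleet_response(response.json()), which avoids decoding the body twice.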

Testing Metrics Collection

The tests/test_exporter.py file contains unit tests that verify metrics collection without calling the real Balena API. For example:

def test_get_fleet_metrics(self):
    with mock.patch("main.requests.get") as mock_get:
        mock_get.return_value.ok = True
        mock_get.return_value.json.return_value = {"d": [{"owns__device": 3, "app_name": "test_fleet"}]}
        result = self.collector.get_fleet_metrics("fleet1")
        expected = ("test_fleet", 3)
        self.assertEqual(result, expected)

This test replaces requests.get with a mock that returns a successful response, then checks that get_fleet_metrics returns the fleet name and online-device count from the payload.

Once the container is running, the exporter exposes the number of online devices per fleet at /metrics on port 8000, ready to be scraped by Prometheus.

Sources:

Dockerfile
README.md
requirements.txt
main.py
tests/test_exporter.py