Prometheus with HashiCorp Nomad#
Run an OpenTelemetry Collector on each Nomad client (or wherever Nomad’s HTTP API is reachable). The collector scrapes Nomad’s built-in Prometheus-format metrics and remote-writes them to your Ametnes Prometheus endpoint. Note that Nomad serves metrics at /v1/metrics with ?format=prometheus, not on a separate /metrics port.
Prerequisites#
- Prometheus provisioned on Ametnes Platform.
- At least one Nomad client is registered (`nomad node status` shows a ready node).
- After telemetry is enabled, Nomad exposes metrics at `http://127.0.0.1:4646/v1/metrics?format=prometheus`.
- Docker is available to the Nomad `docker` task driver if you use the OpenTelemetry Collector job below.
1. Enable Nomad Prometheus metrics#
In the Nomad agent configuration (for example under /etc/nomad.d/):
telemetry {
prometheus_metrics = true
publish_allocation_metrics = true
publish_node_metrics = true
}
Restart Nomad, then verify on the host that the metrics endpoint returns `nomad_*` series.
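A quick smoke test, assuming the default agent HTTP address of 127.0.0.1:4646:

```shell
# Fetch Nomad's metrics in Prometheus text format and show the first
# nomad_* series; any output confirms telemetry is enabled.
curl -s "http://127.0.0.1:4646/v1/metrics?format=prometheus" | grep '^nomad_' | head
```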
2. OpenTelemetry Collector job (Nomad + Docker)#
Run otel/opentelemetry-collector-contrib, scrape Nomad locally, and remote-write to Ametnes Prometheus.
Important details:
- Use `network_mode = "host"` so `127.0.0.1:4646` refers to the host’s Nomad HTTP API, not the container’s loopback.
- Render the collector config with a Nomad `template` block (for example to `local/config.yaml`).
- In the template body, escape OpenTelemetry’s `${env:VAR}` as `$${env:VAR}` so Nomad’s HCL parser does not consume `${...}`.
Example job sketch:
job "otel-collector" {
  datacenters = ["dc1"]
  type        = "service"

  group "otel" {
    task "otelcontrib" {
      driver = "docker"

      config {
        image        = "otel/opentelemetry-collector-contrib:<version>"
        network_mode = "host"
        args         = ["--config=/local/config.yaml"]
      }

      template {
        destination = "local/config.yaml"
        data        = <<EOH
extensions:
  basicauth/prw:
    client_auth:
      username: $${env:PRW_USER}
      password: $${env:PRW_PASSWORD}

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: nomad
          scrape_interval: 15s
          metrics_path: /v1/metrics
          params:
            format: ["prometheus"]
          static_configs:
            - targets: ["127.0.0.1:4646"]

processors:
  batch: {}

exporters:
  prometheusremotewrite:
    endpoint: https://<ametnes-prometheus-endpoint>/api/v1/write
    auth:
      authenticator: basicauth/prw

service:
  extensions: [basicauth/prw]
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [prometheusremotewrite]
EOH
      }

      env {
        PRW_USER     = "<username>"
        PRW_PASSWORD = "<password>" # prefer Nomad Variables / Vault in production
      }

      resources {
        cpu    = 500
        memory = 512
      }
    }
  }
}
Deploy and inspect:
nomad job run otel-collector.nomad
nomad job status otel-collector
nomad alloc logs -f <alloc-id> otelcontrib
Replace https://<ametnes-prometheus-endpoint>/api/v1/write and PRW_USER / PRW_PASSWORD with the endpoint and credentials from your Ametnes Platform Prometheus resource.
3. Verify metrics in Prometheus (read API)#
Query remote Ametnes Prometheus:
curl -sS -G \
-u '<user>:<password>' \
--data-urlencode 'query=nomad_client_allocations_blocked' \
'https://<ametnes-prometheus-endpoint>/api/v1/query' | jq .
Success looks like "status":"success" with a non-empty result, and typical labels such as job="nomad", instance="127.0.0.1:4646", plus node_id, datacenter, and related labels as emitted by Nomad.
HashiCorp Nomad Dashboard#
Import the community HashiCorp Nomad dashboard JSON file. You need a Grafana Prometheus data source pointed at your Ametnes Prometheus instance before the dashboard can query metrics.
- Dashboard source: HashiCorp Nomad (ID 15764)
- Add Prometheus as a data source (skip if you already completed this): in Grafana go to Connections → Data sources → Add data source → Prometheus. Set URL to your Ametnes Prometheus base URL (no trailing slash), enable Basic auth with the User and Password from your Prometheus resource, then Save & test until the health check succeeds. For the full sequence, see Connect Grafana to Prometheus.
- Go to Dashboards → New → Import.
- Paste the dashboard JSON content (or upload the JSON file) in the import screen.
- When prompted, choose the Prometheus data source you added above.
- Click Import.
Adjust the dashboard’s variables or job selectors if your remote-write labels differ from the dashboard defaults.
For a more composable and repeatable process, Infrastructure as Code (IaC) is usually a better option.
Grafana’s Terraform provider can create the Prometheus data source and the dashboard, but it does not replicate the UI import step where you pick a data source from a dropdown. The dashboard JSON must already reference a data source that exists in Grafana (usually by uid in each panel’s datasource block). In practice you either:
- Set a known `uid` on `grafana_data_source` and edit the downloaded JSON once so panels use that UID, or
- Keep your exported JSON unchanged and use Terraform’s `replace()` to substitute the UID string embedded in the file with the UID of the data source you manage (shown below).
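To find the UID embedded in a downloaded export, a grep over the JSON is usually enough. The snippet below builds a small, hypothetical two-panel export purely for illustration; run the same grep on your real dashboard file:

```shell
# Hypothetical export to illustrate; community dashboards typically
# repeat one datasource uid across panels.
cat > /tmp/nomad-15764.json <<'EOF'
{"panels":[{"datasource":{"type":"prometheus","uid":"abc123"}},
           {"datasource":{"type":"prometheus","uid":"abc123"}}]}
EOF

# List every datasource uid referenced in the JSON.
grep -o '"uid": *"[^"]*"' /tmp/nomad-15764.json | sort -u
# → "uid":"abc123"
```

Whatever single UID this reveals is the value to substitute via `replace()`.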
Store the dashboard JSON in your repository (for example dashboards/nomad-15764.json) and manage Grafana resources with Terraform:
terraform {
  required_providers {
    grafana = {
      source  = "grafana/grafana"
      version = "~> 3.0"
    }
  }
}

provider "grafana" {
  url  = var.grafana_url
  auth = "${var.grafana_user}:${var.grafana_password}"
}

resource "grafana_data_source" "prometheus" {
  uid                 = var.prometheus_datasource_uid
  type                = "prometheus"
  name                = "Ametnes Prometheus"
  url                 = var.prometheus_url
  basic_auth_enabled  = true
  basic_auth_username = var.prometheus_user

  secure_json_data_encoded = jsonencode({
    basicAuthPassword = var.prometheus_password
  })
}

resource "grafana_dashboard" "nomad" {
  depends_on = [grafana_data_source.prometheus]

  config_json = replace(
    file("${path.module}/dashboards/nomad-15764.json"),
    var.nomad_dashboard_embedded_prometheus_uid,
    grafana_data_source.prometheus.uid
  )

  overwrite = true
}

variable "grafana_url" {
  type = string
}

variable "grafana_user" {
  type = string
}

variable "grafana_password" {
  type      = string
  sensitive = true
}

variable "prometheus_url" {
  type = string
}

variable "prometheus_user" {
  type = string
}

variable "prometheus_password" {
  type      = string
  sensitive = true
}

variable "prometheus_datasource_uid" {
  type        = string
  description = "Stable UID for the Prometheus data source in Grafana (panels in the dashboard JSON should end up referencing this value)."
  default     = "ametnes-prometheus"
}

variable "nomad_dashboard_embedded_prometheus_uid" {
  type        = string
  description = "The Prometheus datasource `uid` string already present in the downloaded dashboard JSON (inspect the file; community exports often repeat the same UID in each panel)."
}
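For reference, a matching terraform.tfvars might look like the sketch below; every value is a placeholder to replace with your own:

```hcl
grafana_url      = "https://<your-grafana-host>"
grafana_user     = "<grafana-user>"
grafana_password = "<grafana-password>"

prometheus_url      = "https://<ametnes-prometheus-endpoint>"
prometheus_user     = "<username>"
prometheus_password = "<password>"

# UID string found inside dashboards/nomad-15764.json
nomad_dashboard_embedded_prometheus_uid = "<uid-from-json>"
```

Keep this file out of version control, or supply the sensitive values via environment variables (`TF_VAR_grafana_password`, etc.) instead.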
Other workloads#
For application metrics that use a standard Prometheus scrape path (not Nomad’s /v1/metrics), add another scrape_configs entry in the collector (or a separate agent) with the correct metrics_path and targets.
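A sketch of such an entry, assuming a hypothetical app listening on 127.0.0.1:8080 with the standard /metrics path (job name, address, and interval are all assumptions):

```yaml
scrape_configs:
  # Existing nomad job entry stays as-is; append app-specific jobs below.
  - job_name: my-app                 # hypothetical application name
    scrape_interval: 15s
    metrics_path: /metrics           # standard Prometheus path
    static_configs:
      - targets: ["127.0.0.1:8080"]  # assumed app address/port
```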
Validation checklist#
- `curl` to `127.0.0.1:4646` shows `nomad_*` series locally.
- Remote Ametnes Prometheus `/api/v1/query` returns data for `nomad_client_allocations_blocked`.
- The imported Nomad dashboard shows panels after label/job alignment.