Monitors

Configure HTTP health checks to detect outages and latency issues automatically.

A monitor is an HTTP health check that BlameTrail runs on a recurring schedule. Each monitor belongs to a service and watches a single endpoint, recording the response status code, response time, and whether the check passed or failed.

Creating a monitor

Navigate to Monitors and click Add Monitor.
Configure the following fields:

Field	Required	Description
Service	Yes	The service this monitor belongs to.
URL	Yes	The endpoint to check (e.g., `https://api.example.com/health`).
HTTP Method	Yes	The request method — `GET`, `POST`, `HEAD`, `PUT`, `PATCH`, or `DELETE`.
Expected Status	Yes	The HTTP status code that counts as a passing check (e.g., `200`).
Interval	Yes	How often to run the check, in seconds.
Latency Threshold	Yes	Maximum acceptable response time in milliseconds. Responses slower than this are recorded as latency failures.
Headers	No	Custom HTTP headers to include with each request (e.g., authentication tokens).

Click Save. The monitor starts checking immediately.
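
As a rough illustration of the fields above, the sketch below models a monitor's settings as a Python dataclass with basic validation. The class name `MonitorConfig`, its attribute names, and the validation rules are assumptions made for this example. They are not BlameTrail's actual data model or API.

```python
from dataclasses import dataclass, field

# Methods listed in the HTTP Method field above.
ALLOWED_METHODS = {"GET", "POST", "HEAD", "PUT", "PATCH", "DELETE"}


@dataclass
class MonitorConfig:
    """Hypothetical container mirroring the fields in the table above."""
    service: str
    url: str
    http_method: str
    expected_status: int
    interval_seconds: int
    latency_threshold_ms: int
    headers: dict[str, str] = field(default_factory=dict)  # optional field

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the config looks valid."""
        problems = []
        if not self.service:
            problems.append("service is required")
        # Assumption: only http:// and https:// URLs make sense for an HTTP check.
        if not self.url.startswith(("http://", "https://")):
            problems.append("url must start with http:// or https://")
        if self.http_method.upper() not in ALLOWED_METHODS:
            problems.append(f"unsupported HTTP method: {self.http_method}")
        if not 100 <= self.expected_status <= 599:
            problems.append("expected_status must be a valid HTTP status code")
        if self.interval_seconds <= 0:
            problems.append("interval_seconds must be positive")
        if self.latency_threshold_ms <= 0:
            problems.append("latency_threshold_ms must be positive")
        return problems


config = MonitorConfig(
    service="checkout-api",  # hypothetical service name
    url="https://api.example.com/health",
    http_method="GET",
    expected_status=200,
    interval_seconds=60,
    latency_threshold_ms=500,
)
print(config.validate())
```

Validating values like these before you open the form can catch an unsupported method or a non-positive interval early.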

How checks work

A background worker sends an HTTP request to the configured URL at the specified interval, using the monitor's method and any custom headers. For each check, BlameTrail records:

Status code — The HTTP response code returned by the endpoint.
Latency — The response time in milliseconds.
Result — Pass or fail, based on whether the status code matches the expected value and the latency is within the threshold.

A check fails if:

The endpoint returns a status code that does not match the expected value (availability failure).
The response time exceeds the latency threshold (latency failure).
The endpoint is unreachable or the connection times out (availability failure).
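
The failure rules above can be sketched as a small classification function. This is a minimal illustration, not BlameTrail's implementation: the names `CheckResult` and `classify_check` are hypothetical, and the choice to report an availability failure first when a response is both wrong and slow is an assumption the rules above do not settle.

```python
from enum import Enum
from typing import Optional


class CheckResult(Enum):
    PASS = "pass"
    AVAILABILITY_FAILURE = "availability_failure"
    LATENCY_FAILURE = "latency_failure"


def classify_check(
    status_code: Optional[int],
    latency_ms: Optional[float],
    expected_status: int,
    latency_threshold_ms: float,
) -> CheckResult:
    """Classify one check. status_code is None when the endpoint was
    unreachable or the connection timed out."""
    # Unreachable endpoint or unexpected status code: availability failure.
    # Assumption: availability is evaluated before latency.
    if status_code is None or status_code != expected_status:
        return CheckResult.AVAILABILITY_FAILURE
    # Response arrived with the right status but exceeded the threshold.
    if latency_ms is not None and latency_ms > latency_threshold_ms:
        return CheckResult.LATENCY_FAILURE
    return CheckResult.PASS


print(classify_check(200, 120.0, 200, 500))  # CheckResult.PASS
print(classify_check(503, 120.0, 200, 500))  # CheckResult.AVAILABILITY_FAILURE
print(classify_check(200, 750.0, 200, 500))  # CheckResult.LATENCY_FAILURE
```

A response that takes exactly the threshold value passes in this sketch, since only responses that exceed the threshold count as latency failures.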

After 3 consecutive failures of the same type, BlameTrail automatically creates an incident. See Incidents for details on incident creation and resolution.
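
One way to picture the three-failure rule is a streak counter. The `FailureStreak` class below is hypothetical and rests on several assumptions not confirmed here: a passing check resets the streak, a failure of a different type restarts the count at one, and the incident is signalled only at the moment the streak reaches the threshold.

```python
from typing import Optional

INCIDENT_THRESHOLD = 3  # consecutive same-type failures, per the rule above


class FailureStreak:
    """Tracks consecutive failures of the same type for one monitor."""

    def __init__(self, threshold: int = INCIDENT_THRESHOLD) -> None:
        self.threshold = threshold
        self.current_type: Optional[str] = None
        self.count = 0

    def record(self, failure_type: Optional[str]) -> bool:
        """Record one check result. Pass None for a passing check.

        Returns True only when this result brings the streak to the
        threshold, meaning an incident would be created now.
        """
        if failure_type is None:
            # Assumption: a passing check clears the streak.
            self.current_type = None
            self.count = 0
            return False
        if failure_type == self.current_type:
            self.count += 1
        else:
            # Assumption: a different failure type starts a new streak.
            self.current_type = failure_type
            self.count = 1
        return self.count == self.threshold


streak = FailureStreak()
for result in ["availability", "availability", "latency", "latency", "latency"]:
    if streak.record(result):
        print(f"create incident: 3 consecutive {result} failures")
```

In this run, the two availability failures do not trigger an incident because the third check fails for a different reason. The incident fires on the third consecutive latency failure.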

Pausing a monitor

You can pause a monitor by setting it to inactive. This stops all scheduled checks without deleting the monitor or its history. To resume, set the monitor back to active.

Pausing is useful during planned maintenance windows or when an endpoint is intentionally offline.

Monitor detail page

The monitor detail page shows:

Current status — Whether the monitor is passing or failing.
Check history — A timeline of recent checks with status codes, latency values, and pass/fail results.
Active incidents — Any open incidents linked to this monitor.

Use the check history to identify patterns — intermittent failures, gradual latency increases, or specific time windows where problems occur.
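
If you copy check results out of the history into your own tooling, a short script can help surface time-of-day patterns. The sketch below assumes each check is available as a `(timestamp, passed)` pair. That record shape and the function name `failure_rate_by_hour` are assumptions for illustration, and no BlameTrail export format is implied.

```python
from collections import defaultdict
from datetime import datetime
from typing import Iterable


def failure_rate_by_hour(checks: Iterable[tuple[datetime, bool]]) -> dict[int, float]:
    """Return the fraction of failed checks for each hour of day (0-23)
    that appears in the input."""
    totals: dict[int, int] = defaultdict(int)
    failures: dict[int, int] = defaultdict(int)
    for timestamp, passed in checks:
        totals[timestamp.hour] += 1
        if not passed:
            failures[timestamp.hour] += 1
    return {hour: failures[hour] / totals[hour] for hour in sorted(totals)}


# Hypothetical sample data, not real monitor output.
history = [
    (datetime(2024, 1, 1, 2, 0), False),
    (datetime(2024, 1, 1, 2, 30), True),
    (datetime(2024, 1, 1, 14, 0), True),
]
print(failure_rate_by_hour(history))
```

An hour with a consistently higher failure rate than the others may point to a scheduled job, backup, or traffic peak worth investigating.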
