BlameTrail

Welcome to BlameTrail

Learn how to monitor your services, track deploys, and pinpoint the code that broke production.

BlameTrail is an incident monitoring platform that detects outages, correlates them with recent deploys, and uses AI to explain what went wrong. This documentation covers everything you need to get started and make the most of the platform.

Where to start

Feature overview

FeatureWhat it does
Uptime monitoringHTTP health checks with configurable intervals and thresholds
Incident detectionAutomatic incident creation after 3 consecutive failures
Deploy trackingRecord every deploy via webhook, enrich with GitHub metadata
Suspect scoringRank recent deploys by how likely they caused an incident
Commit analysisAI-powered inspection of code changes — file classification, risk scoring, diagnosis
Fix proposalsGenerate revert PRs or AI-powered code fixes for incidents with suspect commits
On-call pagingPage the rotation over SMS, voice (with DTMF acknowledge), and browser push, with per-tenant guardrails
AI postmortemsAuto-draft postmortems on resolve — timeline, customer impact, root cause, action items
Public status pagesCustomer-facing status at status.yourdomain.com with timestamped incident updates
IntegrationsGitHub, Sentry, Slack, Grafana, and custom webhooks
Team managementOrganizations, roles, and member invitations

On this page