Suspect Scoring
How BlameTrail ranks recent deploys by likelihood of causing an incident using temporal proximity and enriched commit context.
When an incident opens, BlameTrail automatically identifies which recent deploys are most likely responsible. This process, called suspect scoring, ranks deploys by their temporal proximity to the first failure and presents them with enriched context so your team can start investigating immediately.
How scoring works
Suspect scoring follows a straightforward process:
- Incident triggers — A monitor records 3 consecutive failures (availability) or 3 consecutive slow responses (latency), creating an incident.
- Window lookup — BlameTrail queries all deploys made to the affected service in the 60 minutes before the first failure.
- Proximity ranking — Deploys are ranked by how close in time they were to the incident start. The most recent deploy before the failure is scored highest.
- Context attachment — Each suspect deploy is annotated with its commit message, branch, deployer, and any enrichment data (PR details, changed files).
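The window lookup and proximity ranking steps above can be sketched roughly as follows. This is a minimal illustration, not BlameTrail's actual implementation: the `Deploy` record, the `rank_suspects` function, and the inclusive window boundaries are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical record; BlameTrail's real deploy schema is not shown on this page.
@dataclass
class Deploy:
    version: str
    service: str
    deployed_at: datetime


WINDOW = timedelta(minutes=60)  # lookback window described above


def rank_suspects(deploys, service, first_failure):
    """Return deploys to `service` inside the window, closest to the failure first."""
    window_start = first_failure - WINDOW
    candidates = [
        d for d in deploys
        # Assumption: both ends of the window are inclusive.
        if d.service == service and window_start <= d.deployed_at <= first_failure
    ]
    # The most recent deploy before the failure ranks highest.
    return sorted(candidates, key=lambda d: d.deployed_at, reverse=True)
```

Deploys to other services, deploys older than the window, and deploys after the first failure are all filtered out before ranking.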
Understanding confidence
Each suspect deploy receives a confidence score based on temporal proximity:
- High confidence — Deployed within minutes of the first failure (closer than the 15-minute mark where the medium band starts). This is the most common pattern: a deploy goes out and, shortly afterward, monitors start failing.
- Medium confidence — Deployed 15-30 minutes before the failure. Still a likely candidate, especially for issues that take time to manifest (memory leaks, queue backlogs, gradual cache invalidation).
- Lower confidence — Deployed 30-60 minutes before the failure. Less likely to be the direct cause, but worth reviewing if higher-ranked suspects are ruled out.
The top suspect is highlighted prominently on the incident detail page and included in Slack notifications.
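As a rough sketch, the confidence bands can be expressed as a mapping from the deploy-to-failure gap to a label. The function name `confidence_band` is hypothetical, and two details are assumptions rather than documented behavior: that "within minutes" means under 15 minutes, and that a gap of exactly 15 or 30 minutes falls into the later band.

```python
from typing import Optional


def confidence_band(minutes_before_failure: float) -> Optional[str]:
    """Map the deploy-to-failure gap (in minutes) to a confidence label."""
    if minutes_before_failure < 0 or minutes_before_failure > 60:
        return None  # outside the 60-minute window, so not a suspect
    if minutes_before_failure < 15:
        return "high"  # assumption: "within minutes" means under 15
    if minutes_before_failure < 30:
        return "medium"  # 15-30 minutes before the failure
    return "lower"  # 30-60 minutes before the failure
```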
What you see for each suspect
For every suspect deploy, BlameTrail displays:
| Field | Source |
|---|---|
| Commit SHA | Deploy webhook payload |
| Commit message | Deploy webhook payload or GitHub enrichment |
| Branch | Deploy webhook payload |
| Deployed by | Deploy webhook payload |
| Time since deploy | Calculated from deploy timestamp to incident start |
| PR title and number | GitHub enrichment (if available) |
| PR author | GitHub enrichment (if available) |
| Changed files | GitHub enrichment, ranked by relevance (if available) |
If commit enrichment is configured, the suspect list becomes significantly more useful. Instead of just seeing a commit SHA, you see the full PR context and which files were changed.
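One way to picture the fields above is as a record whose enrichment part is optional. The class and field names below (`Suspect`, `Enrichment`, `describe`) are illustrative assumptions, not BlameTrail's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Enrichment:
    """Fields that only exist when GitHub enrichment is available (hypothetical shape)."""
    pr_title: str
    pr_number: int
    pr_author: str
    changed_files: List[str] = field(default_factory=list)  # assumed already ranked by relevance


@dataclass
class Suspect:
    """Fields taken from the deploy webhook payload, plus optional enrichment."""
    commit_sha: str
    commit_message: str
    branch: str
    deployed_by: str
    minutes_since_deploy: int
    enrichment: Optional[Enrichment] = None


def describe(s: Suspect) -> str:
    """Render a one-line summary, adding PR context only when it exists."""
    base = (
        f"{s.commit_sha[:7]} on {s.branch} by {s.deployed_by}, "
        f"{s.minutes_since_deploy} min before incident"
    )
    if s.enrichment is None:
        return base
    return f"{base} (PR #{s.enrichment.pr_number}: {s.enrichment.pr_title})"
```

Keeping enrichment in a separate optional object mirrors the table: the webhook fields are always present, while the PR fields appear only "if available".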
AI summary integration
When an incident has an AI-generated summary, BlameTrail includes suspect deploy context in the prompt. The AI can reference:
- What code was changed in the top suspects
- Which files were modified and their relevance
- The relationship between the changed code and the type of failure observed
This produces summaries that go beyond "the service is down" to "the service started failing 3 minutes after a deploy that modified the database connection pool configuration."
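The exact prompt BlameTrail sends is not documented here. Purely as an illustration of how suspect context could be folded into a prompt, the sketch below renders ranked suspects as plain-text lines. The function `build_suspect_context`, the dictionary keys, and the limit of three suspects are all assumptions.

```python
def build_suspect_context(suspects, max_suspects=3):
    """Render the top suspects as plain-text lines for a summary prompt.

    `suspects` is assumed to be a list of dicts already sorted by rank, each with
    'version' and 'minutes_before_failure', and optionally 'changed_files'.
    """
    lines = []
    for rank, s in enumerate(suspects[:max_suspects], start=1):
        lines.append(
            f"{rank}. {s['version']} deployed "
            f"{s['minutes_before_failure']} min before first failure"
        )
        # Changed files are only present when enrichment is available.
        for path in s.get("changed_files", []):
            lines.append(f"   changed: {path}")
    return "\n".join(lines)
```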
Scoring without enrichment
Suspect scoring works even without GitHub enrichment. If no GitHub token is configured or the service is not linked to a repository, BlameTrail still:
- Identifies deploys in the 60-minute window
- Ranks them by temporal proximity
- Shows commit SHA, branch, deployer, and commit message from the webhook payload
Enrichment adds depth, but the core scoring mechanism only depends on deploy timestamps.
Example timeline
14:00 Deploy v2.3.0 (branch: main, by: alice)
14:12 Deploy v2.3.1 (branch: hotfix/cache, by: bob)
14:15 Monitor starts failing
14:17 Incident created (3 consecutive failures)

In this scenario, BlameTrail would rank:
- v2.3.1 (highest) — Deployed 3 minutes before first failure
- v2.3.0 (lower) — Deployed 15 minutes before first failure
Both deploys appear on the incident page. If enrichment is available, you would see that v2.3.1 touched cache configuration files, immediately suggesting where to look.
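The arithmetic behind this ranking can be checked with a few lines of Python. The calendar date is arbitrary and the variable names are illustrative; only the clock times come from the timeline.

```python
from datetime import datetime

# The date is arbitrary; only the times of day are taken from the timeline.
first_failure = datetime(2024, 1, 1, 14, 15)
deploys = {
    "v2.3.0": datetime(2024, 1, 1, 14, 0),
    "v2.3.1": datetime(2024, 1, 1, 14, 12),
}

# Minutes between each deploy and the first failure.
gaps = {
    version: int((first_failure - deployed_at).total_seconds() // 60)
    for version, deployed_at in deploys.items()
}

# Smallest gap first: the deploy closest to the failure ranks highest.
ranking = sorted(gaps, key=gaps.get)

print(gaps)     # {'v2.3.0': 15, 'v2.3.1': 3}
print(ranking)  # ['v2.3.1', 'v2.3.0']
```

Note that the gaps are measured to the first failure at 14:15, not to the incident creation time at 14:17.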
Limitations
- 60-minute window — Deploys made more than 60 minutes before the first failure are not considered. If you suspect an older deploy caused the issue, check the deploy history manually.
- Single service scope — Scoring only considers deploys to the service that owns the failing monitor. Cross-service incidents require manual correlation.
- Temporal proximity only — The scoring algorithm does not analyze code content. It ranks by time, then relies on enrichment and AI summaries to provide code-level context.