
The incident graph: connecting AWS events to PagerDuty in seconds
Incidents rarely happen in isolation. The real signal appears when you connect telemetry from AWS with the people, customers, and runbooks involved. That’s why we built the incident graph.
How the graph works
- Edges from events. CloudWatch alarms, Lambda logs, and customer impact reports are ingested and mapped to the same nodes as PagerDuty incidents and Slack conversations.
- Memory with nuance. Vexly remembers which teams owned the last fix and which customers were impacted. It uses that memory to personalize suggested actions.
- Autonomous actions. When confidence is high, Vexly executes the fix directly. When confidence is low, it drafts the remediation in Slack and tags the right humans.
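
To make the three points above concrete, here is a minimal sketch of the idea, not Vexly's actual implementation. The class name, node IDs, relation labels, and the 0.9 confidence threshold are all assumptions made for illustration; the post does not specify any of them.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical threshold: what counts as "high" confidence is an assumption.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class IncidentGraph:
    # Maps a node id to a set of (relation, neighbor id) pairs.
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def link(self, source, relation, target):
        """Add an undirected edge so either node can reach the other."""
        self.edges[source].add((relation, target))
        self.edges[target].add((relation, source))

    def related(self, node):
        """Return every node connected to `node`, excluding `node` itself."""
        seen, stack = {node}, [node]
        while stack:
            current = stack.pop()
            for _, neighbor in self.edges[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        seen.discard(node)
        return seen


def decide(confidence, threshold=CONFIDENCE_THRESHOLD):
    """Choose the action path: execute directly, or draft for humans in Slack."""
    return "execute" if confidence >= threshold else "draft_in_slack"


# Illustrative ids only: one alarm, one incident, one thread, one owning team.
graph = IncidentGraph()
graph.link("cloudwatch:alarm-1", "triggered", "pagerduty:incident-1")
graph.link("pagerduty:incident-1", "discussed_in", "slack:thread-1")
graph.link("pagerduty:incident-1", "owned_by", "team:payments")

context = graph.related("cloudwatch:alarm-1")
action = decide(confidence=0.95)
```

Starting from the alarm, `related` walks the edges and collects the incident, the Slack thread, and the owning team, which is the kind of shared context the graph is meant to surface. The `decide` function shows the confidence gate in its simplest form: one threshold separating autonomous execution from a drafted remediation.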
Why it matters
- Lower MTTR without more alerts. We cut noisy alerts and focus on correlated actions that move the incident forward.
- Context stays fresh. The dashboard view of the graph mirrors into Slack so everyone sees the same source of truth.
- Safer automation. Every action is auditable. The blog is where we share what worked, and what didn’t, as we ship new guardrails.
Vexly is built to be a teammate, not a script. The incident graph keeps that teammate aligned with the humans on-call.
