Alert Pipeline Overview
Alerts from BYOC clusters flow through three ingestion paths, all converging into the alerts.lark_alerts table in StarRocks.
Architecture
Ingestion Paths
1. Lark MCP (interactive / ad-hoc)
Used during development and manual backfills. Claude calls the Lark MCP tool (im_v1_message_list) to fetch messages, saves results to a file, then process_lark_page.py parses and inserts them.
Key files: load_lark_alerts.py, process_lark_page.py
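The parse step can be sketched as follows. This is a minimal illustration, not the contents of process_lark_page.py: the function name `parse_lark_page` is hypothetical, and the page layout (an `items` list whose entries carry `message_id`, a millisecond `create_time` string, `msg_type`, and `body.content`) is an assumption based on the usual shape of Lark message-list responses.

```python
import json
from datetime import datetime, timezone


def parse_lark_page(raw: str) -> list[dict]:
    """Turn one saved page of Lark messages into rows keyed by message ID."""
    page = json.loads(raw)
    # Assumed layout: {"data": {"items": [...]}} or a bare {"items": [...]}.
    items = page.get("data", page).get("items", [])
    rows = []
    for item in items:
        # Assumed: create_time is a millisecond epoch timestamp stored as a string.
        created_ms = int(item["create_time"])
        rows.append({
            "message_id": item["message_id"],
            "created_at": datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc),
            "msg_type": item.get("msg_type", ""),
            "content": item.get("body", {}).get("content", ""),
        })
    return rows
```

The rows keep Lark's native om_ message ID unchanged, so a page that is processed twice produces the same IDs both times.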
2. Grafana Webhook (real-time)
A Flask server on port 5050 receives Grafana/Alertmanager webhook POSTs. Each alert gets a deterministic wh_-prefixed message ID for deduplication.
Key file: alert_webhook.py
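A deterministic ID of this kind might be derived as sketched below. The hashed fields follow the Deduplication table; everything else is an assumption made for illustration: the function names, the `cluster_id` label name, plain concatenation without a separator, and use of the full hex digest. alert_webhook.py may differ on any of these.

```python
import hashlib


def webhook_message_id(alert: dict) -> str:
    """Build a wh_-prefixed ID from one entry of an Alertmanager-style payload."""
    labels = alert.get("labels", {})
    parts = [
        labels.get("alertname", ""),
        labels.get("cluster_id", ""),  # assumed label name
        alert.get("status", ""),
        alert.get("startsAt", ""),
    ]
    # Assumed: plain concatenation and the untruncated SHA-256 hex digest.
    digest = hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()
    return "wh_" + digest


def ids_for_payload(payload: dict) -> list[str]:
    """Return one ID per alert in the webhook payload's "alerts" list."""
    return [webhook_message_id(alert) for alert in payload.get("alerts", [])]
```

Because the status is part of the hash, the firing and resolved notifications for the same alert get different IDs, while a retried POST of the same notification gets the same one.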
3. Daily Pipeline (scheduled)
A cron job fetches yesterday's alerts directly from the Lark API (no MCP dependency), parses them, inserts them into StarRocks, and generates daily recommendations.
Key file: daily_alert_pipeline.py
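The "yesterday" window could be computed as in the sketch below. The helper name `yesterday_window` is hypothetical, and two things are assumed rather than taken from daily_alert_pipeline.py: that the API query takes second-resolution epoch timestamps, and that "yesterday" is the calendar day in the timezone of the `now` value passed in.

```python
from datetime import datetime, time, timedelta, timezone


def yesterday_window(now: datetime) -> tuple[int, int]:
    """Return (start, end) epoch seconds for the calendar day before `now`."""
    # Midnight at the start of the current day, in the timezone carried by `now`.
    today_midnight = datetime.combine(now.date(), time.min, tzinfo=now.tzinfo)
    start = today_midnight - timedelta(days=1)
    # Half-open interval: start is inclusive, end is the first second of today.
    return int(start.timestamp()), int(today_midnight.timestamp())
```

For example, `yesterday_window(datetime.now(timezone.utc))` gives the bounds of the previous UTC day. Passing `now` in explicitly keeps the function testable and makes the timezone choice visible at the call site.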
Deduplication
Each path uses a different message ID scheme to avoid collisions:
| Path | ID Prefix | Strategy |
|---|---|---|
| Lark MCP | om_ | Lark's native message ID |
| Grafana Webhook | wh_ | SHA256(alertname + cluster_id + status + startsAt) |
| Daily Pipeline | om_ + -f/-r suffix | Lark message ID with status suffix for multi-alert cards |
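The lark_alerts table uses DUPLICATE KEY(message_id, created_at). Note that in StarRocks a Duplicate Key table only defines the sort key and keeps every loaded row, so re-inserting the same row stores a second copy rather than being a no-op. The deterministic message IDs make such repeats easy to detect, but discarding them requires either a check against existing message IDs before insert or a Primary Key (or Unique Key) table.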
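An application-side guard can complement the table key. The sketch below shows the suffix scheme and a filter that drops rows whose ID is already known. Both helper names are hypothetical, the mapping of -f and -r to the firing and resolved statuses is an assumption, and `known_ids` stands in for whatever lookup against lark_alerts the real scripts perform.

```python
def suffixed_id(lark_message_id: str, status: str) -> str:
    """Append a status suffix to a Lark message ID for multi-alert cards."""
    # Assumed mapping: -f for firing, -r for resolved.
    suffix = {"firing": "-f", "resolved": "-r"}[status]
    return lark_message_id + suffix


def drop_known(rows: list[dict], known_ids: set[str]) -> list[dict]:
    """Keep rows whose message_id is neither stored nor repeated in this batch."""
    seen = set(known_ids)  # copy so the caller's set is not modified
    fresh = []
    for row in rows:
        if row["message_id"] not in seen:
            seen.add(row["message_id"])
            fresh.append(row)
    return fresh
```

Filtering on the ID alone works across all three paths because the prefixes and suffixes keep their ID spaces from overlapping with the webhook's wh_ IDs.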