Lark MCP Ingestion
Interactive alert ingestion via the Lark MCP tool. Used for ad-hoc fetches and historical backfills.
Flow
Lark MCP (im_v1_message_list)
→ JSON saved to file
→ process_lark_page.py
→ parse_alert_card() per message
→ insert_rows() → alerts.lark_alerts
Lark Channel
- Chat ID: oc_392593bb5ace3f00ee10ab53bfe7681f (BYOC Online Alarm)
- MCP parameters: container_id_type: "chat", sort_type: "ByCreateTimeDesc", page_size: 50
- Results are often >100KB, so they get saved to files for batch processing.
Key Files
load_lark_alerts.py
Core parsing and insert logic. Two main functions:
parse_alert_card(msg) -> list[dict]
Parses a single Lark interactive card message into one alert row. Handles:
- Title parsing: Extracts cluster name and admin email from "Cluster: <name> (<email>)" format. Detects RESOLVED cards from title suffix.
- Field extraction: Regex-based extraction of Region, Account, and Cluster ID (UUID) from concatenated card text.
- Link extraction: Pulls Admin, Dashboard, and Silence URLs from <a> tags in card elements.
- Multi-alert cards: A single card can contain both Firing and Resolved sections. These are merged into one row with pipe-delimited alert_status (e.g., "Firing|Resolved") and alert_name fields.
- Silence artifact filtering: Skips fake alerts from "Create Silence" button blocks via _is_silence_artifact().
- Detail truncation: alert_detail capped at 2,048 characters; raw_content at 65,000.
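The parsing steps above can be illustrated with a minimal sketch. It is not the code in load_lark_alerts.py: the regexes, the helper names (parse_title, find_cluster_id, merge_sections, truncate_fields), the cluster_name and admin_email keys, and the assumption that a resolved title simply ends with the word RESOLVED are all hypothetical, chosen only to match the behavior described.

```python
import re

# Assumed title shape, taken from the description: "Cluster: <name> (<email>)"
TITLE_RE = re.compile(r"Cluster:\s*(?P<cluster>.+?)\s*\((?P<email>[^()\s]+@[^()\s]+)\)")
# Standard 8-4-4-4-12 hexadecimal UUID layout
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

MAX_DETAIL_CHARS = 2048   # cap for alert_detail
MAX_RAW_CHARS = 65000     # cap for raw_content


def parse_title(title):
    """Extract cluster name, admin email, and resolved flag from a card title."""
    m = TITLE_RE.search(title)
    return {
        "cluster_name": m.group("cluster") if m else None,
        "admin_email": m.group("email") if m else None,
        # Assumption: a resolved card's title ends with the word RESOLVED.
        "is_resolved": title.strip().upper().endswith("RESOLVED"),
    }


def find_cluster_id(card_text):
    """Return the first UUID found in the concatenated card text, or None."""
    m = UUID_RE.search(card_text)
    return m.group(0) if m else None


def merge_sections(sections):
    """Merge (status, name) pairs from one card into pipe-delimited fields."""
    return {
        "alert_status": "|".join(status for status, _ in sections),
        "alert_name": "|".join(name for _, name in sections),
    }


def truncate_fields(row):
    """Return a copy of row with alert_detail and raw_content capped."""
    out = dict(row)
    if out.get("alert_detail") is not None:
        out["alert_detail"] = out["alert_detail"][:MAX_DETAIL_CHARS]
    if out.get("raw_content") is not None:
        out["raw_content"] = out["raw_content"][:MAX_RAW_CHARS]
    return out
```

Merging Firing and Resolved sections into one pipe-delimited row keeps a one-row-per-card shape, at the cost of needing a split on "|" when querying by individual alert.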
insert_rows(cursor, rows) -> int
Batch INSERT into alerts.lark_alerts. Returns the count of inserted rows.
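As a rough sketch of what a batch insert with this signature might look like, the example below uses the generic Python DB-API executemany call. The column list is limited to fields named elsewhere in this document, the %s placeholder style is an assumption (it depends on the database driver), and the function is named insert_rows_sketch to make clear it is not the actual implementation.

```python
# Columns mentioned in this document; the real table likely has more.
COLUMNS = [
    "alert_name",
    "alert_status",
    "alert_detail",
    "raw_content",
    "created_at",
    "created_at_utc",
]


def insert_rows_sketch(cursor, rows):
    """Batch-insert row dicts through a DB-API cursor and return the row count."""
    if not rows:
        return 0
    # Assumption: the driver uses the "format" paramstyle (%s placeholders).
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    sql = (
        f"INSERT INTO alerts.lark_alerts ({', '.join(COLUMNS)}) "
        f"VALUES ({placeholders})"
    )
    # Missing keys become NULL rather than raising KeyError.
    params = [tuple(row.get(col) for col in COLUMNS) for row in rows]
    cursor.executemany(sql, params)
    return len(params)
```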
process_lark_page.py
Batch processor that reads a saved MCP result file (JSON), iterates over all messages, calls parse_alert_card() for each, and inserts via insert_rows().
Usage:
.venv/bin/python3 process_lark_page.py <filepath>
Output format:
msgs=50 alerts=42 range=2026-01-19 00:00:00+00:00 to 2026-01-20 00:00:00+00:00 has_more=True
PAGE_TOKEN=xxx
Or when done:
msgs=12 alerts=10 range=... has_more=False
DONE=true
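When driving a multi-page backfill, the two output shapes above can be read programmatically to decide whether to fetch another page. The following is a small sketch of such a reader; parse_page_output is a hypothetical helper, not part of process_lark_page.py, and it relies only on the line formats shown above.

```python
def parse_page_output(stdout):
    """Parse the lines printed by process_lark_page.py into a small dict."""
    result = {"page_token": None, "done": False, "has_more": None}
    for line in stdout.splitlines():
        line = line.strip()
        if line.startswith("PAGE_TOKEN="):
            # Token to pass to the next im_v1_message_list call
            result["page_token"] = line.split("=", 1)[1]
        elif line == "DONE=true":
            result["done"] = True
        elif line.startswith("msgs="):
            # has_more is the last key=value token on the summary line
            parts = line.rsplit("has_more=", 1)
            if len(parts) == 2:
                result["has_more"] = parts[1].strip() == "True"
    return result
```

A caller would keep requesting pages with the returned page_token until done is true.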
Timestamp Handling
All timestamps are converted at parse time:
- create_time from Lark (epoch ms) is converted to both Pacific (created_at) and UTC (created_at_utc).
- ZoneInfo("America/Los_Angeles") handles DST transitions automatically.
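The conversion described above can be sketched with the standard library alone. The helper name convert_create_time is hypothetical, and the sketch assumes create_time may arrive either as an integer or as a numeric string.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def convert_create_time(create_time_ms):
    """Convert epoch milliseconds to (created_at, created_at_utc) datetimes."""
    # int() accepts both an integer and a numeric string.
    utc_dt = datetime.fromtimestamp(int(create_time_ms) / 1000, tz=timezone.utc)
    # astimezone applies the correct PST/PDT offset for that instant.
    return utc_dt.astimezone(PACIFIC), utc_dt


created_at, created_at_utc = convert_create_time("1768780800000")
print(created_at.isoformat())      # 2026-01-18T16:00:00-08:00
print(created_at_utc.isoformat())  # 2026-01-19T00:00:00+00:00
```

Both values are timezone-aware and represent the same instant, so they compare equal; only the displayed offset differs.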