Tech Stack

Reference for all technologies, frameworks, and infrastructure used in the platform.

Backend

Technology	Version	Purpose
Python	3.11	Primary backend language
FastAPI	>=0.109	API framework (14 routers, async support)
Uvicorn	>=0.27	ASGI server
mysql-connector-python	>=8.2	StarRocks connectivity (MySQL-compatible protocol)
Pydantic	>=2.5	Request/response validation, settings
PyJWT	>=2.8	JWT authentication tokens
psycopg2-binary	>=2.9	Supabase (PostgreSQL) connectivity for auth/users
python-dotenv	>=1.0	Environment variable management
PyYAML	>=6.0	Health scoring rules, issue grouping rules
Flask	-	Lightweight webhook receiver for Grafana alerts (alert_webhook)
Anthropic SDK	-	Claude API for Investigator, Patrol, and Chat agents
Pandas	-	Data manipulation in agent tools and scoring pipelines
Requests	-	HTTP client for Lark API and Knowledge Lake MCP

Agent Architecture

Component	Module	Description
`AgentBase`	`byoc_agent/agent_base.py`	Reusable tool-use loop; supports Anthropic and OpenAI-compatible LLM providers
`AgentTools`	`byoc_agent/agent_tools.py`	SQL-backed tools for autonomous agents (no MCP subprocess needed)
`KnowledgeLakeClient`	`byoc_agent/knowledge_lake_client.py`	HTTP+SSE client for MCP-based knowledge search
`UnifiedScorer`	`byoc_agent/unified_scorer.py`	14-dimension health scoring: 60% metrics + 25% alerts + 15% tier
`IssueTracker`	`byoc_agent/issue_tracker.py`	Alert grouping with anomaly/failure/escalation strategy
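
The 60/25/15 weighting listed for `UnifiedScorer` can be illustrated with a small sketch. Only the three weights come from the table above; the function name, the 0-100 scale for each sub-score, and the input validation are assumptions made for illustration. How the 14 dimensions roll up into the metrics sub-score is not shown here.

```python
# Weights taken from the UnifiedScorer row above; everything else is illustrative.
WEIGHTS = {"metrics": 0.60, "alerts": 0.25, "tier": 0.15}


def composite_score(metrics: float, alerts: float, tier: float) -> float:
    """Blend three sub-scores into one health score.

    Assumption: each sub-score is already normalised to the range 0-100.
    """
    parts = {"metrics": metrics, "alerts": alerts, "tier": tier}
    for name, value in parts.items():
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} sub-score must be within 0-100, got {value}")
    # Weighted sum; the weights add up to 1.0, so the result stays within 0-100.
    return sum(WEIGHTS[name] * value for name, value in parts.items())
```

Under these assumptions, sub-scores of 80 (metrics), 40 (alerts), and 100 (tier) combine to 48 + 10 + 15 = 73.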

Frontend

Technology	Version	Purpose
React	19.x	UI framework
TypeScript	~5.9	Type-safe JavaScript
Vite	6.x	Build tool and dev server
Tailwind CSS	4.2	Utility-first CSS framework
React Router	7.x	Client-side routing (12 pages)
TanStack React Query	5.x	Server state management and caching
Lucide React	0.577	Icon library
React Markdown	10.x	Rendering agent/patrol report markdown
Looker Embed SDK	2.x	Embedded Looker dashboards with SSO
Axios	1.x	HTTP client for API calls
class-variance-authority	0.7	Component variant management (shadcn/ui pattern)

Frontend Pages

12 pages under frontend/src/pages/:

Overview.tsx        -- Fleet health dashboard
Issues.tsx          -- Issue tracker with triage workflow
RawAlerts.tsx       -- Raw lark_alerts table view
BreakingPoint.tsx   -- Capacity and risk forecasting
UsagePatterns.tsx   -- Cluster utilization trends
Investigations.tsx  -- Agent investigation reports
AIIssues.tsx        -- AI-grouped issue summaries
Patrol.tsx          -- Fleet patrol reports
Chat.tsx            -- Interactive BYOCAgent chat
LLMUsage.tsx        -- Token and cost tracking
Settings.tsx        -- User preferences
Help.tsx            -- Platform documentation

Database

System	Purpose	Access Method
StarRocks	Primary OLAP store (3 databases: `metrics`, `byoc`, `alerts`)	`mysql-connector-python` over MySQL protocol on port 9030
Supabase	User authentication and management	`psycopg2-binary` (PostgreSQL) + Supabase Auth API

StarRocks Connection

Production (EC2): Direct connection to 1cogri9tn-internal.cloud-app.celerdata.com:9030 (same VPC)
Local development: SSH tunnel through the bastion (100.100.118.18 via Tailscale), forwarded to 127.0.0.1:9030
Important: The mysql CLI does not work against StarRocks here (protocol mismatch). Always connect with Python's mysql.connector.
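
A minimal connection sketch for the notes above, using `mysql.connector` from mysql-connector-python. The environment variable names (`STARROCKS_HOST` and so on) and the default values are hypothetical; the actual keys in `.env` may differ. The defaults target the local SSH tunnel on 127.0.0.1:9030.

```python
import os


def starrocks_config(env=None) -> dict:
    """Build keyword arguments for mysql.connector.connect().

    The STARROCKS_* variable names are hypothetical placeholders.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("STARROCKS_HOST", "127.0.0.1"),  # tunnel endpoint by default
        "port": int(env.get("STARROCKS_PORT", "9030")),  # StarRocks MySQL-protocol port
        "user": env.get("STARROCKS_USER", ""),
        "password": env.get("STARROCKS_PASSWORD", ""),
        "database": env.get("STARROCKS_DATABASE", "metrics"),
    }


def connect(env=None):
    """Open a connection; the driver is imported lazily so the config helper has no dependencies."""
    import mysql.connector  # provided by mysql-connector-python

    return mysql.connector.connect(**starrocks_config(env))
```

In production the same helper would work by setting the host variable to the internal CelerData endpoint, leaving the port at 9030.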

Infrastructure

Component	Service	Purpose
Compute	AWS EC2	Bastion host, cron jobs (daily pipeline, Sentinel/Investigator every 15 min, Patrol 2x/day)
Serverless	AWS Lambda	Bastion wake-up function
IaC	AWS CloudFormation	Infrastructure provisioning
Monitoring	AWS CloudWatch	Infrastructure-level logging and alarms
Containers	Docker	Application packaging (`python:3.11-slim` base image)
Networking	Tailscale VPN	Secure access to bastion host for local development
CI/CD	GitHub Actions	Auto-deploy on push to `main` (paths: `byoc_agent/**`, `Dockerfile`, `pyproject.toml`)

Deployment Pipeline

The GitHub Actions workflow (.github/workflows/deploy.yml):

Triggers on push to main when files under byoc_agent/**, Dockerfile, or pyproject.toml change
Calls Lambda to wake the EC2 bastion (waits 75s for boot)
SSHs into bastion and deploys updated code
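
The trigger condition in the first step can be approximated locally to check whether a given set of changed files would start a deploy. This is a rough sketch: the patterns are the ones listed for GitHub Actions in the Infrastructure table, but Python's `fnmatch` is only an approximation of GitHub's path-filter semantics, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Path filters as listed for the deploy workflow in the Infrastructure table.
TRIGGER_PATHS = ("byoc_agent/**", "Dockerfile", "pyproject.toml")


def triggers_deploy(changed_files) -> bool:
    """Return True if any changed file matches one of the trigger patterns.

    fnmatch lets '*' span '/' characters, which is close enough to '**'
    for these three patterns but is not a full glob implementation.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in TRIGGER_PATHS
    )
```

For example, a push that only touches `frontend/src/pages/Chat.tsx` would not match any pattern, while a push that edits `byoc_agent/agent_base.py` would.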

External Integrations

System	Integration	Purpose
Lark	Open API (`im_v1_message_list`)	Alert channel message fetching
Grafana	Webhook contact point + dashboards	Alert delivery and cluster dashboard links
Looker	Embed SDK + LookML	Embedded analytics dashboards
MCP (Model Context Protocol)	Streamable HTTP	Knowledge Lake search (vector + fulltext), Lark API tools
Claude API	Anthropic SDK	LLM backbone for Investigator, Patrol, and Chat agents

Configuration

Key configuration files:

File	Purpose
`.env`	StarRocks credentials, Lark API keys, LLM provider config
`byoc_agent/health_rules.yaml`	Scoring thresholds per metric dimension
`byoc_agent/issue_rules.yaml`	Alert grouping rules and severity mappings
`backend/config.py`	FastAPI settings (Pydantic Settings)
`Dockerfile`	Container image definition
`entrypoint.sh`	Container startup script
`.github/workflows/deploy.yml`	CI/CD pipeline definition
