LocalAnalyst -- LLM-Free Analysis Engine

Source: byoc_agent/local_analyst.py

The LocalAnalyst runs pre-built analyses without needing an LLM API. Each quick-action button in the UI maps to a function that queries live StarRocks data and returns a formatted markdown report.

Available Analyses

Cluster Health Overview

Function: cluster_health_overview()

Produces a health overview of all clusters over the last 7 days. It queries compaction scores (gauge, from the MV), query latency (histogram, from the v6 view), error counts, and total query counts.

Output includes:

Summary counts: X clusters monitored -- Y Healthy, Z Warning, W Critical
Per-cluster table with: health status, avg/max compaction, avg latency, errors (with rate), total queries
Recommendations for Warning and Critical clusters

Health classification:

Critical: max compaction > 5,000 or avg latency > 1,000ms
Warning: max compaction > 2,000 or avg latency > 200ms
Healthy: neither condition above is met (max compaction at most 2,000 and avg latency at most 200ms)
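
The classification rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code in local_analyst.py: the function names classify_health and summarize are hypothetical, and checking Critical before Warning is an assumption about precedence.

```python
from collections import Counter


def classify_health(max_compaction: float, avg_latency_ms: float) -> str:
    """Map the two metrics to a status using the documented thresholds."""
    # Assumption: Critical is evaluated first, so it wins over Warning.
    if max_compaction > 5000 or avg_latency_ms > 1000:
        return "Critical"
    if max_compaction > 2000 or avg_latency_ms > 200:
        return "Warning"
    return "Healthy"


def summarize(statuses: list) -> str:
    """Build the summary-count line from a list of per-cluster statuses."""
    counts = Counter(statuses)
    return (
        f"{len(statuses)} clusters monitored -- "
        f"{counts['Healthy']} Healthy, {counts['Warning']} Warning, "
        f"{counts['Critical']} Critical"
    )
```

Values exactly on a threshold (for example max compaction of 2,000) fall into the lower category here, because the documented comparisons are strict.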

Query Latency Trends

Function: query_latency_trends()

Analyzes query latency trends over 7 days using window functions (FIRST_VALUE / LAST_VALUE) to compute percent change from start to end of the period.

Output includes:

Table with cluster_id, first_val, last_val, pct_change
Interpretation guidelines (positive = worsening, >20% = flag for review)
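
To make the pct_change column concrete, here is a small Python sketch of the arithmetic. The real analysis computes first_val and last_val in SQL with window functions; the helper names below, the zero-baseline handling, and the exact formula are assumptions for illustration only.

```python
from typing import Optional


def pct_change(first_val: float, last_val: float) -> Optional[float]:
    """Percent change from the first to the last value of the period."""
    # Assumption: a zero baseline has no meaningful percent change.
    if first_val == 0:
        return None
    return (last_val - first_val) / first_val * 100.0


def needs_review(change: Optional[float], threshold_pct: float = 20.0) -> bool:
    """Flag a cluster when latency worsened by more than the threshold."""
    return change is not None and change > threshold_pct
```

A cluster whose latency moved from 100ms to 125ms has a change of 25%, which is positive (worsening) and above the 20% review threshold.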

Clusters Near Breaking Point

Function: clusters_near_breaking_point()

Identifies clusters approaching their limits across two dimensions:

Resource thresholds -- Clusters whose max compaction score exceeds 1,000, or whose JVM heap or BE process memory exceeds its limit (gauge metrics from the MV).
High latency hours -- Clusters with query latency above 200ms (histogram from the v6 view), reported as the number of hours affected.

Output includes:

Resources exceeding thresholds table
Hours with high latency table
Action items for compaction backlog, JVM heap, and latency patterns
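
The "hours with high latency" dimension amounts to counting, per cluster, the hourly rows above the 200ms threshold. The sketch below shows that counting step in plain Python; the row shape (cluster_id, hour, latency_ms) and the function name are hypothetical, and the real analysis presumably does this in SQL against the v6 view.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_high_latency_hours(
    rows: Iterable[Tuple[str, str, float]],
    threshold_ms: float = 200.0,
) -> Dict[str, int]:
    """Count hours per cluster whose latency exceeds the threshold.

    rows: hypothetical (cluster_id, hour, latency_ms) tuples, one per hour.
    """
    counts: Counter = Counter()
    for cluster_id, _hour, latency_ms in rows:
        if latency_ms > threshold_ms:
            counts[cluster_id] += 1
    # Clusters with no hours above the threshold are omitted entirely.
    return dict(counts)
```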

Alert Summary

Function: alert_summary()

Produces a notification-style summary built from real Lark alerts.

Output includes:

Recent Lark alerts (7d) grouped by alert_name and status
Top firing alerts by cluster
Error spikes from metrics (days with >100 errors)
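
The error-spike rule (days with more than 100 errors) can be illustrated as a simple aggregation. This is a sketch under assumptions: the input shape (cluster_id, day, error_count), the per-cluster grouping, and the function name are all hypothetical, since the source does not say whether spikes are computed per cluster or fleet-wide.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def find_error_spike_days(
    rows: Iterable[Tuple[str, str, int]],
    threshold: int = 100,
) -> List[Tuple[str, str]]:
    """Return (cluster_id, day) pairs whose summed errors exceed the threshold.

    rows: hypothetical (cluster_id, day, error_count) records; several
    records for the same cluster and day (e.g. hourly deltas) are summed.
    """
    totals = defaultdict(int)
    for cluster_id, day, errors in rows:
        totals[(cluster_id, day)] += errors
    return sorted(key for key, total in totals.items() if total > threshold)
```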

Usage Patterns

Function: usage_patterns()

Analyzes usage patterns across the fleet.

Output includes:

Cluster activity ranking (total queries, average hourly QPS)
Peak usage hours (UTC)
Low-activity clusters (avg concurrent queries < 5) -- candidates for downsizing
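
The low-activity filter is a threshold on average concurrent queries. A minimal sketch, assuming the averages have already been computed per cluster; the function name and the dict input are hypothetical:

```python
from typing import Dict, List


def low_activity_clusters(
    avg_concurrent_by_cluster: Dict[str, float],
    threshold: float = 5.0,
) -> List[str]:
    """Return cluster IDs whose average concurrent queries fall below the threshold."""
    return sorted(
        cluster_id
        for cluster_id, avg in avg_concurrent_by_cluster.items()
        if avg < threshold
    )
```

A cluster sitting exactly at 5 is not listed, matching the strict "< 5" rule above.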

Data Source Strategy

The LocalAnalyst uses two different data sources based on metric type:

Metric Type	Source	Reason
Gauge (point-in-time)	`metrics.amv_hourly_snapshots_v1`	MAX/AVG are meaningful for snapshot values
Counter/Histogram (deltas)	`metrics.metrics_hourly_view_v6`	Provides pre-computed hourly deltas
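
The table above can be read as a lookup from metric type to source. The sketch below encodes it that way; the function source_for and the constant names are hypothetical, and only the two table names come from this document.

```python
GAUGE_SOURCE = "metrics.amv_hourly_snapshots_v1"   # point-in-time snapshots
DELTA_SOURCE = "metrics.metrics_hourly_view_v6"    # pre-computed hourly deltas


def source_for(metric_type: str) -> str:
    """Pick the source that suits a metric type, per the table above."""
    kind = metric_type.strip().lower()
    if kind == "gauge":
        return GAUGE_SOURCE
    if kind in ("counter", "histogram"):
        return DELTA_SOURCE
    # Any other type is outside what this document describes.
    raise ValueError(f"Unsupported metric type: {metric_type!r}")
```

Keeping this choice in one place avoids, for example, averaging raw cumulative counter values when the hourly deltas are what the analysis needs.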

Function Registry

The ANALYSIS_FUNCTIONS dict maps UI button labels to functions:

ANALYSIS_FUNCTIONS = {
    "Cluster Health Overview": cluster_health_overview,
    "Query Latency Trends": query_latency_trends,
    "Clusters Near Breaking Point": clusters_near_breaking_point,
    "Alert Summary": alert_summary,
    "Usage Patterns": usage_patterns,
}

Usage

from byoc_agent.local_analyst import ANALYSIS_FUNCTIONS

# Run a specific analysis
report_md = ANALYSIS_FUNCTIONS["Cluster Health Overview"]()
print(report_md)

# Or call directly
from byoc_agent.local_analyst import cluster_health_overview
report = cluster_health_overview()
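
When the label comes from user input, such as a UI button, a small wrapper around the registry lookup avoids a KeyError on an unknown label. The helper below is a hypothetical sketch and not part of byoc_agent; it takes the registry as a parameter so it can be tried without a StarRocks connection.

```python
from typing import Callable, Dict


def run_analysis(label: str, registry: Dict[str, Callable[[], str]]) -> str:
    """Run the analysis registered under label, or report the valid labels."""
    func = registry.get(label)
    if func is None:
        available = ", ".join(sorted(registry))
        return f"Unknown analysis '{label}'. Available: {available}"
    return func()


# Stand-in registry for illustration; in practice pass ANALYSIS_FUNCTIONS.
demo_registry = {"Alert Summary": lambda: "## Alert Summary\n(no alerts)"}
print(run_analysis("Alert Summary", demo_registry))
print(run_analysis("Nope", demo_registry))
```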
