Building an AI Agent for Graph Construction
This technical deep-dive explores how Helix constructs actionable business graphs from natural language queries: the architecture, the decision-making process, and the techniques that make it work reliably.
The Challenge
Turning a question like "Show me sales trends by region" into an effective visualization requires multiple decisions:
- Which data source(s) contain the relevant information?
- What SQL query or aggregation is needed?
- Which chart type best represents the answer?
- What transformations make the data visualization-ready?
- What labels, titles, and formatting convey the insight clearly?
Helix makes these decisions through a multi-stage agent pipeline.
Architecture Overview
The graph construction pipeline consists of several cooperating components:
```
User Query
    |
    v
[Query Understanding]   -- intent classification, entity extraction
    |
    v
[Dataset Selection]     -- schema matching, relevance scoring
    |
    v
[Query Generation]      -- SQL/aggregation synthesis
    |
    v
[Data Retrieval]        -- execute queries, handle errors
    |
    v
[Graph Type Detection]  -- match data shape to visualization
    |
    v
[Graph Construction]    -- Plotly/D3 specification generation
    |
    v
Rendered Visualization
```
Query Understanding
The first stage extracts structured information from natural language:
- Intent Classification: Is this a comparison, trend, distribution, or composition question?
- Entity Extraction: Which columns, tables, time ranges, filters are referenced?
- Semantic Parsing: What aggregations (sum, average, count) are implied?
We use a combination of fine-tuned prompts and few-shot examples to guide the LLM:
```python
query_analysis_prompt = """
Analyze this business question and extract:
- metrics: what should be measured
- dimensions: how to group/segment
- filters: any constraints mentioned
- time_range: temporal bounds if specified
- comparison_type: trend/comparison/distribution/composition

Question: {user_query}
Available tables: {schema_summary}
"""
```
Dataset Selection
Given extracted entities, we score candidate datasets by:
- Column name matching: Fuzzy match between query terms and schema
- Semantic similarity: Embedding-based comparison of query intent vs table descriptions
- Usage history: Tables frequently used together get boost scores
This stage also handles joins when data spans multiple tables, using foreign key relationships and historical query patterns.
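The scoring step can be sketched as a weighted blend of the three signals. In this sketch, `semantic_sim` and `usage_boost` are assumed to arrive precomputed (from the embedding comparison and query-log statistics); the weights are illustrative, not the production values:

```python
from difflib import SequenceMatcher

def column_match_score(query_terms: list[str], columns: list[str]) -> float:
    """Average best fuzzy-match ratio (0-1) between query terms and column names."""
    if not query_terms or not columns:
        return 0.0
    best_per_term = [
        max(SequenceMatcher(None, term.lower(), col.lower()).ratio() for col in columns)
        for term in query_terms
    ]
    return sum(best_per_term) / len(best_per_term)

def score_dataset(query_terms, columns, semantic_sim, usage_boost,
                  weights=(0.5, 0.35, 0.15)):
    """Blend column matching, semantic similarity, and usage history into one score."""
    w_col, w_sem, w_use = weights
    return (w_col * column_match_score(query_terms, columns)
            + w_sem * semantic_sim
            + w_use * usage_boost)
```

An exact column name match scores 1.0; partial matches like "sales" against `total_sales` still score well, which is what makes fuzzy matching useful for abbreviated or pluralized query terms.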
Query Generation
The core SQL generation uses constrained decoding with schema awareness:
```python
def generate_query(analysis, schema):
    # Build prompt with full schema context
    prompt = build_query_prompt(analysis, schema)

    # Generate with temperature=0 for determinism
    sql = llm.generate(prompt, temperature=0)

    # Validate against schema
    validated = validate_sql(sql, schema)

    # Add safety limits
    return add_row_limits(validated)
```
Key techniques for reliable SQL generation:
- Schema-in-context: Full table definitions in every prompt
- Example-driven: Similar query patterns as few-shot examples
- Validation loop: Parse and verify before execution
- Error recovery: LLM-assisted debugging on failures
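The validation and safety steps can be approximated without touching production data. A sketch, assuming a SQLite dialect: `validate_sql` runs `EXPLAIN` against an empty in-memory copy of the schema so bad SQL fails at compile time, and `add_row_limits` uses a naive keyword check (a real implementation would inspect the parsed AST rather than raw strings):

```python
import sqlite3

def validate_sql(sql: str, schema_ddl: list[str]) -> str:
    """Check that the query parses and references real tables/columns by running
    EXPLAIN against an empty in-memory database built from the schema DDL."""
    conn = sqlite3.connect(":memory:")
    try:
        for ddl in schema_ddl:
            conn.execute(ddl)
        conn.execute(f"EXPLAIN {sql}")  # raises sqlite3.OperationalError on bad SQL
    finally:
        conn.close()
    return sql

def add_row_limits(sql: str, max_rows: int = 10_000) -> str:
    """Append a LIMIT clause unless one is already present (naive keyword check)."""
    if "limit" in sql.lower().split():
        return sql
    return f"{sql.rstrip().rstrip(';')} LIMIT {max_rows}"
```

Because `EXPLAIN` compiles the statement without executing it, this catches unknown tables and columns cheaply, with no risk of running an expensive query during validation.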
Graph Type Detection
After data retrieval, we analyze the result shape to select visualization type:
```python
def detect_graph_type(df, query_intent):
    features = {
        'num_rows': len(df),
        'num_columns': len(df.columns),
        'has_datetime': any(is_datetime(df[c]) for c in df.columns),
        'has_categories': count_categorical(df),
        'has_numeric': count_numeric(df),
        'cardinality': estimate_cardinality(df),
    }

    # Rule-based selection with ML fallback
    if query_intent == 'trend' and features['has_datetime']:
        return 'line'
    elif query_intent == 'comparison' and features['has_categories']:
        return 'bar'
    # ... additional rules

    # ML classifier for ambiguous cases
    return ml_classifier.predict(features)
```
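As the rule count grows, the if/elif chain generalizes naturally into an ordered rule table. A sketch over plain feature dicts; the pie and histogram rules here are illustrative additions standing in for the elided "additional rules", not the production rule set:

```python
# Ordered (predicate, graph_type) pairs; first match wins.
RULES = [
    (lambda intent, f: intent == "trend" and f["has_datetime"], "line"),
    (lambda intent, f: intent == "comparison" and f["has_categories"], "bar"),
    (lambda intent, f: intent == "composition" and f["has_categories"], "pie"),
    (lambda intent, f: intent == "distribution" and f["has_numeric"], "histogram"),
]

def rule_based_type(intent: str, features: dict, default: str = "table") -> str:
    """Return the first matching graph type, or a safe default for ambiguous
    cases (where the real pipeline falls through to the ML classifier)."""
    for predicate, graph_type in RULES:
        if predicate(intent, features):
            return graph_type
    return default

features = {"has_datetime": True, "has_categories": 1, "has_numeric": 2}
```

Keeping the rules as data makes them easy to reorder, test individually, and audit when a chart type surprises a user.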
Graph Construction
Finally, we generate the visualization specification. For Plotly charts:
```python
def build_plotly_spec(df, graph_type, metadata):
    spec = {
        'data': build_traces(df, graph_type),
        'layout': {
            'title': generate_title(metadata),
            'xaxis': {'title': metadata['x_label']},
            'yaxis': {'title': metadata['y_label']},
            # Auto-formatting based on data types
        },
    }

    # Apply styling and accessibility
    spec = apply_theme(spec)
    spec = ensure_accessible_colors(spec)
    return spec
```
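The `build_traces` helper above is assumed; a minimal sketch for line and bar charts over a list of row dicts, taking explicit axis keys rather than reading them from metadata. (Plotly renders a line chart as a `scatter` trace with `mode='lines'`.)

```python
def build_traces(rows: list[dict], graph_type: str, x_key: str, y_key: str) -> list[dict]:
    """Convert query result rows into a single Plotly trace dict (line/bar only)."""
    trace = {
        "x": [row[x_key] for row in rows],
        "y": [row[y_key] for row in rows],
        # Plotly has no 'line' trace type: lines are scatter traces in line mode.
        "type": "scatter" if graph_type == "line" else "bar",
    }
    if graph_type == "line":
        trace["mode"] = "lines"
    return [trace]

rows = [
    {"month": "2024-01", "sales": 120},
    {"month": "2024-02", "sales": 135},
]
traces = build_traces(rows, "line", x_key="month", y_key="sales")
```

Returning a list keeps the signature stable when multi-series charts later need one trace per series.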
Reliability Engineering
Making this pipeline production-ready requires:
Caching and Memoization
Query analysis, schema lookups, and common visualizations are cached to reduce latency and cost.
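A minimal sketch of the memoization strategy using `functools.lru_cache`, keyed on both the query text and a schema version so entries invalidate when the schema changes. The counter stands in for the expensive LLM analysis call:

```python
from functools import lru_cache

calls = {"count": 0}

@lru_cache(maxsize=1024)
def cached_query_analysis(user_query: str, schema_version: str) -> tuple[str, str]:
    # Stand-in for the expensive LLM analysis; the body only runs on a cache miss.
    calls["count"] += 1
    return (user_query, schema_version)

cached_query_analysis("sales by region", "v1")
cached_query_analysis("sales by region", "v1")  # cache hit: no new call
cached_query_analysis("sales by region", "v2")  # schema changed: cache miss
```

Including the schema version in the key avoids serving stale analyses after a migration without needing explicit cache invalidation.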
Graceful Degradation
When SQL fails, we attempt simpler queries. When graph detection is uncertain, we fall back to safe defaults (tables, basic bars).
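The fallback behavior can be sketched as an ordered chain of candidate queries, where the last entry is a safe default expected to always execute:

```python
def run_with_fallbacks(candidates: list[str], execute):
    """Try each candidate query in order and return the first successful result.
    Re-raise the last error only if every candidate, including the safe
    default at the end of the list, fails."""
    last_error = None
    for sql in candidates:
        try:
            return execute(sql)
        except Exception as err:  # a real implementation would catch DB errors only
            last_error = err
    raise last_error
```

The `execute` callable is whatever runs SQL against the warehouse; passing it in keeps the fallback logic testable without a live database.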
Observability
Every pipeline stage emits structured logs for debugging, and an LLM judge evaluates output quality on a sample of requests.
Rate Limiting and Quotas
Protection against expensive queries through row limits, execution timeouts, and per-user quotas.
Results
This architecture achieves:
- ~90% first-attempt success rate on well-formed queries
- Sub-3-second latency for cached schema lookups
- Graceful handling of ambiguous or malformed requests
- Consistent visualization quality through automated validation
For more details on specific components, see our other tech blog posts on query generation and LLM evaluation.