
Building an AI Agent for Graph Construction

This technical deep-dive explores how Helix constructs actionable business graphs from natural language queries: the architecture, decision-making process, and techniques that make it work reliably.

The Challenge

Turning a question like "Show me sales trends by region" into an effective visualization requires multiple decisions:

  • Which data source(s) contain the relevant information?
  • What SQL query or aggregation is needed?
  • Which chart type best represents the answer?
  • What transformations make the data visualization-ready?
  • What labels, titles, and formatting convey the insight clearly?

Helix makes these decisions through a multi-stage agent pipeline.

Architecture Overview

The graph construction pipeline consists of several cooperating components:

User Query
    |
    v
[Query Understanding]  -- Intent classification, entity extraction
    |
    v
[Dataset Selection]    -- Schema matching, relevance scoring
    |
    v
[Query Generation]     -- SQL/aggregation synthesis
    |
    v
[Data Retrieval]       -- Execute queries, handle errors
    |
    v
[Graph Type Detection] -- Match data shape to visualization
    |
    v
[Graph Construction]   -- Plotly/D3 specification generation
    |
    v
Rendered Visualization

Query Understanding

The first stage extracts structured information from natural language:

  • Intent Classification: Is this a comparison, trend, distribution, or composition question?
  • Entity Extraction: Which columns, tables, time ranges, filters are referenced?
  • Semantic Parsing: What aggregations (sum, average, count) are implied?

We use a combination of carefully tuned prompts and few-shot examples to guide the LLM:

query_analysis_prompt = """
Analyze this business question and extract:
- metrics: what should be measured
- dimensions: how to group/segment
- filters: any constraints mentioned
- time_range: temporal bounds if specified
- comparison_type: trend/comparison/distribution/composition

Question: {user_query}
Available tables: {schema_summary}
"""

Dataset Selection

Given extracted entities, we score candidate datasets by:

  • Column name matching: Fuzzy match between query terms and schema
  • Semantic similarity: Embedding-based comparison of query intent vs table descriptions
  • Usage history: Tables frequently used together get boost scores

This stage also handles joins when data spans multiple tables, using foreign key relationships and historical query patterns.
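The scoring above can be sketched as a weighted blend of signals. This is a simplified illustration using stdlib fuzzy matching; the weights, the `score_dataset` signature, and the externally supplied `semantic_sim` and `usage_boost` values are assumptions, not Helix's production scorer:

```python
from difflib import SequenceMatcher

def name_match_score(query_terms, columns):
    """Average, over query terms, of the best fuzzy match against any column name."""
    scores = []
    for term in query_terms:
        best = max(
            SequenceMatcher(None, term.lower(), col.lower()).ratio()
            for col in columns
        )
        scores.append(best)
    return sum(scores) / len(scores)

def score_dataset(query_terms, table, semantic_sim=0.0, usage_boost=0.0):
    """Blend column matching, embedding similarity, and usage history.
    Weights are illustrative; real weights would be tuned on logged queries."""
    return (
        0.5 * name_match_score(query_terms, table["columns"])
        + 0.3 * semantic_sim
        + 0.2 * usage_boost
    )
```

Candidate tables are then ranked by this score, with the semantic and usage terms breaking ties when several schemas contain similarly named columns.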

Query Generation

The core SQL generation uses constrained decoding with schema awareness:

def generate_query(analysis, schema):
    # Build prompt with full schema context
    prompt = build_query_prompt(analysis, schema)

    # Generate with temperature=0 for determinism
    sql = llm.generate(prompt, temperature=0)

    # Validate against schema
    validated = validate_sql(sql, schema)

    # Add safety limits
    return add_row_limits(validated)

Key techniques for reliable SQL generation:

  • Schema-in-context: Full table definitions in every prompt
  • Example-driven: Similar query patterns as few-shot examples
  • Validation loop: Parse and verify before execution
  • Error recovery: LLM-assisted debugging on failures
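The validation loop and error recovery can be combined into a single retry wrapper. A minimal sketch, where `llm` and `validate` are stand-ins for the real model client and SQL validator (both hypothetical here):

```python
def generate_with_recovery(prompt, llm, validate, max_attempts=3):
    """Generate SQL, validate it, and feed the error back to the LLM on failure."""
    last_error = None
    for _ in range(max_attempts):
        if last_error is None:
            sql = llm(prompt)
        else:
            # LLM-assisted debugging: include the failure in the retry prompt
            sql = llm(f"{prompt}\n\nPrevious attempt failed: {last_error}\nFix the SQL.")
        try:
            validate(sql)  # parse and verify against the schema before execution
            return sql
        except ValueError as err:
            last_error = str(err)
    raise RuntimeError(f"SQL generation failed after {max_attempts} attempts: {last_error}")
```

Keeping validation separate from execution means a malformed query never reaches the database, and the error message doubles as a repair hint for the next attempt.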

Graph Type Detection

After data retrieval, we analyze the result shape to select visualization type:

def detect_graph_type(df, query_intent):
    features = {
        'num_rows': len(df),
        'num_columns': len(df.columns),
        'has_datetime': any(is_datetime(df[c]) for c in df.columns),
        'num_categorical': count_categorical(df),
        'num_numeric': count_numeric(df),
        'cardinality': estimate_cardinality(df),
    }

    # Rule-based selection with ML fallback
    if query_intent == 'trend' and features['has_datetime']:
        return 'line'
    elif query_intent == 'comparison' and features['num_categorical'] > 0:
        return 'bar'
    # ... additional rules

    # ML classifier for ambiguous cases
    return ml_classifier.predict(features)

Graph Construction

Finally, we generate the visualization specification. For Plotly charts:

def build_plotly_spec(df, graph_type, metadata):
    spec = {
        'data': build_traces(df, graph_type),
        'layout': {
            'title': generate_title(metadata),
            'xaxis': {'title': metadata['x_label']},
            'yaxis': {'title': metadata['y_label']},
            # Auto-formatting based on data types
        }
    }

    # Apply styling and accessibility
    spec = apply_theme(spec)
    spec = ensure_accessible_colors(spec)

    return spec

Reliability Engineering

Making this pipeline production-ready requires:

Caching and Memoization

Query analysis, schema lookups, and common visualizations are cached to reduce latency and cost.
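A sketch of the analysis cache, keyed on a normalized form of the query so trivially different phrasings share an entry (the normalization rules and helper names are illustrative):

```python
import hashlib

def normalize(query):
    """Canonicalize case and whitespace before hashing the cache key."""
    return " ".join(query.lower().split())

_analysis_cache = {}

def cached_analysis(query, analyze):
    """Memoize query analysis; `analyze` stands in for the LLM-backed stage."""
    key = hashlib.sha256(normalize(query).encode()).hexdigest()
    if key not in _analysis_cache:
        _analysis_cache[key] = analyze(query)
    return _analysis_cache[key]
```

Since analysis is the most expensive stage per token, even a modest hit rate here cuts both latency and LLM spend noticeably.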

Graceful Degradation

When SQL fails, we attempt simpler queries. When graph detection is uncertain, we fall back to safe defaults (tables, basic bars).
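The graph-detection fallback can be expressed as a confidence gate. A simplified sketch; the `detect` callable, the 0.7 threshold, and the width heuristic are assumptions for illustration:

```python
def choose_visualization(detect, df_columns, query_intent, threshold=0.7):
    """Use the detector's choice when confident; otherwise fall back to safe defaults.
    `detect` returns a (graph_type, confidence) pair."""
    graph_type, confidence = detect(df_columns, query_intent)
    if confidence >= threshold:
        return graph_type
    # Wide results read better as a table; narrow ones as a basic bar chart
    return 'table' if len(df_columns) > 3 else 'bar'
```

The key property is that an uncertain detector never produces an exotic chart: the worst case is a plain but always-legible rendering.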

Observability

Every pipeline stage emits structured logs for debugging. An LLM judge evaluates output quality on a sample of requests.

Rate Limiting and Quotas

Protection against expensive queries through row limits, execution timeouts, and per-user quotas.
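One way the `add_row_limits` step named earlier might work, shown with a regex check for brevity (a production version would inspect the parsed query, not the text; the 10,000-row default is illustrative):

```python
import re

def add_row_limits(sql, max_rows=10000):
    """Append a LIMIT clause unless the query already has one."""
    if re.search(r'\bLIMIT\b\s+\d+', sql, re.IGNORECASE):
        return sql
    return f"{sql.rstrip().rstrip(';')} LIMIT {max_rows}"
```

Execution timeouts and per-user quotas sit one layer up, in the query executor, so a single runaway request can't monopolize the warehouse.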

Results

This architecture achieves:

  • ~90% first-attempt success rate on well-formed queries
  • Sub-3-second latency for cached schema lookups
  • Graceful handling of ambiguous or malformed requests
  • Consistent visualization quality through automated validation

For more details on specific components, see our other tech blog posts on query generation and LLM evaluation.
