Data Insights Overview

Ask natural-language questions about your data and get instant SQL-generated answers, visualizations, and insights. Data Insights coordinates a team of AI agents that generate SQL, execute queries, analyze results, and produce visualizations — all in real time via streaming.

Data Insights is built on the A2A (Agent-to-Agent) protocol, an open standard for inter-agent communication that uses JSON-RPC 2.0 over HTTP, with Server-Sent Events (SSE) for streaming. Each agent is independently deployable and discoverable via Agent Cards.

Who Is It For

Business Users & Knowledge Workers

Ask questions about your data in plain language and get instant answers, visualizations, and reports — no SQL or technical skills required.

Data Analysts

Explore data through multi-turn conversations, generate visualizations, and export results. The AI handles SQL generation so you can focus on analysis.

Business Leaders

Get AI-powered insights from enterprise data to support strategic decisions. Generate on-demand reports and discover trends through natural-language queries.

Data Engineers

Validate SQL generation, review query plans, and configure data connections. Monitor agent performance and query execution through observability tooling.

How It Works

User Asks a Question

A user types a natural-language question in the chat interface, such as “What were the top 10 products by revenue last quarter?”

Talk2Data Service Routes the Request

The Talk2Data Service (REST + SSE gateway) creates a session, routes the message to the Insights Agent, and establishes an SSE stream back to the client.

Insights Agent Orchestrates

The Insights Agent runs an agentic loop: it reasons about the question, selects tools, and delegates SQL generation to the Text2SQL Agent.

Text2SQL Agent Generates and Validates SQL

The Text2SQL Agent analyzes the database schema, generates SQL using an LLM, validates it with sqlglot, and executes the query against the user’s data connection.

Results Stream Back

Query results, analysis, and visualizations stream back to the user in real time via A2A events (TaskStatusUpdateEvent for progress, TaskArtifactUpdateEvent for results).
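As a rough illustration of the streaming step, the sketch below parses a Server-Sent Events stream into JSON payloads. It assumes standard SSE framing (`data:` lines terminated by a blank line) with JSON bodies. The payload shape used by the Talk2Data Service is not described here, so the `kind` and `text` keys in the sample are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each SSE event found in `lines`."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event; emit it if it carried data.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format strips one optional leading space from the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream: one progress event followed by one result event.
sample_stream = [
    ": keep-alive\n",
    'data: {"kind": "status", "text": "Generating SQL..."}\n',
    "\n",
    'data: {"kind": "artifact", "text": "10 rows"}\n',
    "\n",
]
payloads = list(iter_sse_payloads(sample_stream))
```

A real client would feed this generator from an HTTP response body read line by line rather than from an in-memory list.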

Architecture

Building a similar solution? Use the Solution Developer Guide › Starter Templates — the Tier 3 template mirrors this layered agent shape with copy-pasteable scaffolds.

The solution follows a layered agent architecture:

Client / React UI
    |
Talk2Data Service (REST + SSE gateway, port 8080)
    |
Insights Agent (agentic loop, reasoning, tool orchestration, port 8002)
    |--- Text2SQL Agent (NL-to-SQL generation, validation, execution, port 8001)
    |--- Coding Agent (LLM-generated Python code execution, port 8004)
    |--- MCP Plotly Server (visualization tools via MCP, port 8000)

Text2SQL and Coding agents communicate with the Insights Agent via A2A, while the MCP Plotly Server is accessed via MCP. The Insights Agent dynamically discovers both sub-agents and MCP tools on each request.

A2A Protocol Integration

All inter-agent communication uses the A2A protocol:

Agent Cards

Each agent publishes a JSON manifest at /.well-known/agent-card.json that describes its identity, skills, and capabilities. Agent Cards enable automatic discovery by the Talk2Data service and Insights Agent at request time. Key fields include:

name — human-readable agent name
skills — list of capabilities with input/output schemas
capabilities — supported features (streaming, multi-turn, etc.)
endpoint — the agent’s A2A service URL
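The fields above can be pictured with a small sketch. The card below is hypothetical: it shows only the four key fields listed, trims the skill entry to an identifier and description, and uses illustrative values throughout (the port is taken from the architecture diagram). The helper names are assumptions, not part of any SDK.

```python
import json

# Hypothetical card for the Text2SQL Agent. Only the four key fields are
# shown and every value is illustrative.
agent_card = {
    "name": "Text2SQL Agent",
    "skills": [
        {
            "id": "text2sql",  # assumed skill identifier
            "description": "Generate, validate, and execute SQL",
        }
    ],
    "capabilities": {"streaming": True},
    "endpoint": "http://localhost:8001",
}

KEY_FIELDS = ("name", "skills", "capabilities", "endpoint")


def missing_card_fields(card: dict) -> list[str]:
    """Return the key fields that are absent from an Agent Card."""
    return [name for name in KEY_FIELDS if name not in card]


def card_url(base_url: str) -> str:
    """Build the well-known discovery URL for an agent's base URL."""
    return base_url.rstrip("/") + "/.well-known/agent-card.json"


card_json = json.dumps(agent_card, indent=2)
```

A discovering service would fetch `card_url(...)`, parse the JSON, and reject cards for which `missing_card_fields` returns a non-empty list.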

Message Parts

A2A messages contain typed parts:

Part Type	Usage
`TextPart`	Natural-language text (questions, analysis, explanations)
`DataPart`	Structured data (datasource configs, query parameters)
`FilePart`	Binary artifacts (charts, exported files)
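To make the part types concrete, here is a minimal sketch using plain dataclasses. These classes are simplified stand-ins that borrow the part names from the table; they are not the A2A SDK types, and the attribute names and the `connection_id` key are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class TextPart:
    text: str


@dataclass
class DataPart:
    data: dict


@dataclass
class FilePart:
    name: str
    mime_type: str
    content: bytes


Part = Union[TextPart, DataPart, FilePart]


@dataclass
class Message:
    role: str
    parts: list[Part] = field(default_factory=list)

    def text(self) -> str:
        """Concatenate the natural-language portions of the message."""
        return "\n".join(p.text for p in self.parts if isinstance(p, TextPart))


# A user question accompanied by structured datasource configuration.
message = Message(
    role="user",
    parts=[
        TextPart("What were the top 10 products by revenue last quarter?"),
        DataPart({"connection_id": "example-connection"}),  # hypothetical key
    ],
)
```

Keeping text and structured data in separate parts lets an agent read configuration without parsing it out of free-form text.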

Events

Agents emit events for real-time frontend updates:

Event	Purpose
`TaskStatusUpdateEvent`	Progress messages (“Analyzing schema…”, “Generating SQL…”)
`TaskArtifactUpdateEvent`	Final results (data tables, charts, text analysis)

Every pipeline step emits a status event before starting work, providing real-time visibility into the agent’s reasoning process.
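One way a client might consume these two event kinds is sketched below. `StatusEvent` and `ArtifactEvent` are simplified, hypothetical stand-ins for `TaskStatusUpdateEvent` and `TaskArtifactUpdateEvent`, and `ChatView` is an assumed name for whatever holds the UI state.

```python
from dataclasses import dataclass, field


@dataclass
class StatusEvent:
    """Simplified stand-in for TaskStatusUpdateEvent."""
    message: str


@dataclass
class ArtifactEvent:
    """Simplified stand-in for TaskArtifactUpdateEvent."""
    name: str
    payload: object


@dataclass
class ChatView:
    progress: list[str] = field(default_factory=list)
    results: dict[str, object] = field(default_factory=dict)

    def apply(self, event: object) -> None:
        """Route an event to the progress log or the results panel."""
        if isinstance(event, StatusEvent):
            self.progress.append(event.message)
        elif isinstance(event, ArtifactEvent):
            self.results[event.name] = event.payload
        else:
            raise TypeError(f"unsupported event: {type(event).__name__}")


view = ChatView()
for event in [
    StatusEvent("Analyzing schema..."),
    StatusEvent("Generating SQL..."),
    ArtifactEvent("result_table", [{"product": "A", "revenue": 100}]),
]:
    view.apply(event)
```

Status events accumulate as a running log, while each artifact replaces any earlier artifact with the same name.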

Context ID

The A2A context_id maps to the session ID for multi-turn conversation state. This enables agents to maintain context across multiple questions in the same conversation.
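The mapping from `context_id` to conversation state can be sketched with an in-memory store. This is only an illustration: the class and method names are hypothetical, and a real deployment would persist state in the database rather than in a dictionary.

```python
from collections import defaultdict


class ConversationStore:
    """Keeps per-conversation history keyed by the A2A context_id."""

    def __init__(self) -> None:
        self._history: dict[str, list[str]] = defaultdict(list)

    def append(self, context_id: str, question: str) -> list[str]:
        """Record a question and return the full history for that context."""
        self._history[context_id].append(question)
        return list(self._history[context_id])


store = ConversationStore()
store.append("session-1", "What were the top 10 products by revenue?")
follow_up = store.append("session-1", "Break that down by region.")
other = store.append("session-2", "How many orders shipped last week?")
```

Because both questions in `session-1` share a `context_id`, the follow-up can be resolved against the earlier question, while `session-2` starts with an empty history.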

Pipeline Framework

Agent workflows are built on the commons.pipeline.Pipeline state-machine framework:

Steps are named functions that perform a unit of work
Each step returns a Transition object with a goto target (next step, break, or error)
The pipeline supports cooperative cancellation for graceful shutdown
Unexpected exceptions are wrapped in StepError for structured error handling

The Text2SQL agent uses this framework for its generate-validate-execute flow:

generate_sql -> validate_sql -> execute_query -> format_results
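The flow above can be approximated with a small state machine. This sketch does not reproduce the `commons.pipeline` API: `run_pipeline`, `BREAK`, and the step bodies are hypothetical, the SQL is a hard-coded placeholder for the LLM call, and the validation step is a toy check standing in for sqlglot.

```python
from dataclasses import dataclass
from typing import Callable

BREAK = "break"  # sentinel target that ends the run


@dataclass
class Transition:
    goto: str


class StepError(Exception):
    """Wraps an unexpected exception raised inside a step."""

    def __init__(self, step: str, cause: Exception) -> None:
        super().__init__(f"step {step!r} failed: {cause}")
        self.step = step
        self.cause = cause


Step = Callable[[dict], Transition]


def run_pipeline(steps: dict[str, Step], start: str, state: dict,
                 is_cancelled: Callable[[], bool] = lambda: False) -> dict:
    """Run steps until one transitions to BREAK or cancellation is requested."""
    current = start
    while current != BREAK:
        if is_cancelled():  # cooperative check between steps
            state["cancelled"] = True
            break
        try:
            transition = steps[current](state)
        except Exception as exc:
            raise StepError(current, exc) from exc
        current = transition.goto
    return state


def generate_sql(state: dict) -> Transition:
    state["sql"] = "SELECT 1 AS answer"  # placeholder for the LLM call
    return Transition("validate_sql")


def validate_sql(state: dict) -> Transition:
    if not state["sql"].lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    return Transition("execute_query")


def execute_query(state: dict) -> Transition:
    state["rows"] = [{"answer": 1}]  # placeholder for the database call
    return Transition("format_results")


def format_results(state: dict) -> Transition:
    state["summary"] = f"{len(state['rows'])} row(s)"
    return Transition(BREAK)


STEPS = {
    "generate_sql": generate_sql,
    "validate_sql": validate_sql,
    "execute_query": execute_query,
    "format_results": format_results,
}
final_state = run_pipeline(STEPS, "generate_sql", {})
```

Returning the next step's name from each step keeps control flow explicit, so a step can branch (for example, back to `generate_sql` after a failed validation) without the runner knowing about the branch.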

LLM Integration

Data Insights uses LiteLLM for provider-agnostic LLM access:

Feature	Details
Client	`commons.llm.LLMClient` wrapping LiteLLM
Model format	`provider/model` (e.g., `vertex_ai/gemini-3.5-flash`, `vertex_ai/claude-opus-4-8`, `openai/gpt-5.5`)
Observability	Langfuse LLM tracing is auto-enabled when `LANGFUSE_HOST` is set. Chat dispatch and the pipeline executor are auto-instrumented with `@observe` decorators. A2A trace context propagates across services, so a single conversation produces one unified trace. Trace IDs are seeded deterministically from `turn_id`, and per-iteration spans are emitted inside agent loops. See Langfuse Setup for configuration and Langfuse Overview for the full tracing model.
Configuration	Per-service LLM env vars (`TALK2DATA_TEXT2SQL_LLM_MODEL`, `TALK2DATA_INSIGHTS_LLM_MODEL`, etc.) via `pydantic_settings.BaseSettings`

Credentials are never hardcoded. All API keys and connection strings are loaded from environment variables or .env files via pydantic_settings.BaseSettings.
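The sketch below shows how a per-service model setting in `provider/model` form might be read and split. To stay dependency-free it reads from a plain mapping instead of `pydantic_settings.BaseSettings`, so it illustrates the convention rather than the actual configuration classes. `ModelRef`, `parse_model`, and `load_model` are hypothetical names.

```python
import os
from typing import Mapping, NamedTuple


class ModelRef(NamedTuple):
    provider: str
    model: str


def parse_model(value: str) -> ModelRef:
    """Split a `provider/model` string on the first slash."""
    provider, sep, model = value.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {value!r}")
    return ModelRef(provider, model)


def load_model(var_name: str, env: Mapping[str, str] = os.environ) -> ModelRef:
    """Read a per-service model setting from the environment."""
    try:
        return parse_model(env[var_name])
    except KeyError:
        raise RuntimeError(f"{var_name} is not set") from None


# Passing an explicit mapping keeps the example independent of the real
# process environment.
example_env = {"TALK2DATA_TEXT2SQL_LLM_MODEL": "openai/gpt-5.5"}
text2sql_model = load_model("TALK2DATA_TEXT2SQL_LLM_MODEL", example_env)
```

Failing fast on a missing or malformed variable surfaces configuration mistakes at startup instead of on the first LLM call.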

Database Schema

The talk2data database schema stores conversation state and artifacts:

Table	Purpose
sessions	Chat sessions with user and project context
conversation_messages	Individual messages within a session
artifacts	Generated outputs (SQL queries, results, visualizations)
feedback	User feedback on agent responses

Primary keys: UUID strings
Timestamps: DateTime(timezone=True) with UTC
ORM: SQLAlchemy 2.0+ async with asyncpg driver
Migrations: Alembic in packages/common-db/
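The key and timestamp conventions can be illustrated without the ORM. The dataclass below is a plain-Python sketch, not a SQLAlchemy model, and the `user_id` and `project_id` field names are assumptions based on the description of the sessions table.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


def new_id() -> str:
    """Generate a primary key as a UUID string."""
    return str(uuid.uuid4())


def utc_now() -> datetime:
    """Return a timezone-aware timestamp in UTC."""
    return datetime.now(timezone.utc)


@dataclass
class SessionRecord:
    user_id: str      # hypothetical column name
    project_id: str   # hypothetical column name
    id: str = field(default_factory=new_id)
    created_at: datetime = field(default_factory=utc_now)


record = SessionRecord(user_id="user-1", project_id="project-1")
```

Storing timezone-aware UTC values avoids ambiguity when services in different time zones read the same rows.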

REST Endpoints

The Talk2Data Service exposes the following REST and SSE endpoints:

Endpoint	Purpose
`GET /talk2data/chat/sessions`	List active chat sessions
`POST /talk2data/chat/*`	Start or continue a chat session (SSE streaming)
`POST /talk2data/v1/sample`	Preview rows from a data connection table (no LLM involved)

The /talk2data/v1/sample endpoint is useful for data exploration before composing a question — it fetches a configurable number of rows from a named table via an existing data connection, using a fully qualified table name (database.schema.table). See Text-to-SQL for request and error details.
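Before calling the sample endpoint, a client could check that the table name is fully qualified. The helper below is a hypothetical sketch: it only splits on dots and does not handle quoted identifiers that themselves contain dots. It makes no assumption about the request body, which is documented in Text-to-SQL.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    database: str
    schema: str
    table: str


def parse_table_name(name: str) -> TableRef:
    """Split `database.schema.table` into its three components."""
    parts = name.split(".")
    if len(parts) != 3 or not all(part.strip() for part in parts):
        raise ValueError(
            f"expected 'database.schema.table', got {name!r}"
        )
    return TableRef(*parts)


table_ref = parse_table_name("sales.public.orders")
```

Validating on the client side gives users an immediate, specific message instead of a round trip to the service.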

Platform Integration

Data Insights integrates with the platform layer for:

Capability	Platform Service
Authentication	Governance (JWT validation via the platform identity provider)
Authorization	Governance (permission checks via the authorization service SDK)
Data connections	Assets (configured database connections)
Asset management	Assets (data connections, artifacts, files, models)

The integration uses auto-generated Python SDKs from the platform’s OpenAPI specs. Data Insights never implements its own permission checks.

Next Steps

Chat with Data

Learn how to use the conversational interface for data analysis.

Text-to-SQL

Understand how natural-language questions are converted to SQL queries.

Agent Registry

See how Data Insights agents are registered and discovered.

Data Source Setup

Configure data connections for your databases.