AI & Strategy · 14 min read

How to Build an AI Data Analyst with Text-to-SQL

Your business team asks the same question every week: "Can someone pull me the numbers on X?" Text-to-SQL lets them ask a database directly in plain English. Here's how to build one that's accurate enough to trust.


Nate Laquis

Founder & CEO

The Problem Text-to-SQL Solves

Every company has the same bottleneck. Business teams need data to make decisions, but the data lives in databases that require SQL to query. So they file a request with the data team, wait 2 to 5 days, get a result that doesn't quite answer their question, clarify, wait again, and eventually make a decision based on stale information.

Text-to-SQL eliminates this loop. A sales manager types "What were our top 10 customers by revenue last quarter?" and gets an answer in seconds. A marketing lead asks "How many users signed up from organic search in March and converted to paid within 30 days?" and gets a precise number instead of a rough estimate.

The technology works by using an LLM to translate natural language questions into SQL queries, executing those queries against your database, and presenting the results in a readable format. It sounds simple. The implementation details are where things get tricky.

[Image: Business analytics dashboard showing data visualizations and query results from an AI data analyst]

Text-to-SQL Architecture

A production text-to-SQL system has five core components:

1. Schema Understanding Layer

The LLM needs to know your database structure: table names, column names, data types, relationships, and what each column actually contains. You provide this as context in the prompt. For small databases (under 20 tables), you can include the full schema. For larger databases, you need a retrieval step that finds the most relevant tables for each question.
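For the larger-database case, the retrieval step can be as simple as scoring each table's documentation against the question and keeping the top matches. Here's a minimal sketch; the table names and descriptions are hypothetical, and a production system would likely use embeddings rather than keyword overlap:

```python
# Hypothetical table documentation; in practice this comes from your schema docs.
TABLE_DOCS = {
    "users": "user accounts: id, email, created_at, plan, mrr",
    "orders": "customer orders: id, user_id, amount_cents, created_at",
    "events": "product analytics events: user_id, name, occurred_at",
}

def relevant_tables(question: str, k: int = 2) -> list[str]:
    """Rank tables by keyword overlap with the question; keep the top k."""
    q_words = set(question.lower().split())

    def score(doc: str) -> int:
        tokens = doc.lower().replace(",", " ").replace(":", " ").split()
        return len(q_words & set(tokens))

    ranked = sorted(TABLE_DOCS, key=lambda t: score(f"{t} {TABLE_DOCS[t]}"),
                    reverse=True)
    return ranked[:k]
```

Only the selected tables' documentation then goes into the prompt, keeping context small even when the database has hundreds of tables.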

2. Query Generation

The LLM takes the user's question plus the schema context and generates a SQL query. Claude and GPT-4 both handle standard SQL well. The challenge is database-specific syntax (PostgreSQL vs MySQL vs BigQuery) and complex queries involving multiple joins, subqueries, or window functions.
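The generation step mostly comes down to prompt assembly. A hedged sketch of the shape that prompt might take (the exact wording is an assumption, not a prescribed template):

```python
def build_prompt(schema_context: str, dialect: str, question: str) -> str:
    """Assemble a query-generation prompt: dialect first, then schema, then
    the question. Naming the dialect up front helps the model pick the right
    syntax (PostgreSQL vs MySQL vs BigQuery)."""
    return (
        f"You are a {dialect} expert. Write one read-only SELECT query.\n\n"
        f"Schema:\n{schema_context}\n\n"
        f"Question: {question}\n"
        "Return only the SQL, with no explanation."
    )
```

The returned string is what you'd send to Claude or GPT-4; pinning the dialect and demanding SQL-only output makes the response easy to extract and validate.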

3. Query Validation

Before executing any generated query, validate it. Parse the SQL to check syntax. Verify that referenced tables and columns exist. Check that the query is read-only (no INSERT, UPDATE, DELETE, DROP). Apply a timeout limit to prevent runaway queries that scan entire tables.

4. Execution Engine

Run the validated query against a read-only replica of your database. Never query your production database directly. Use a connection pool with strict timeout limits (10 to 30 seconds). Return results as structured data.
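A minimal execution sketch, illustrated with the standard-library sqlite3 module so it's self-contained; against PostgreSQL you would instead open a pooled connection to the read replica and set `statement_timeout`. The `query_only` pragma and progress handler stand in for read-only permissions and the query timeout:

```python
import sqlite3
import time

def run_readonly(conn: sqlite3.Connection, sql: str,
                 timeout_s: float = 15.0, max_rows: int = 1000):
    """Execute an already-validated query with a wall-clock timeout and row cap."""
    conn.execute("PRAGMA query_only = ON")  # refuse writes, belt and braces
    deadline = time.monotonic() + timeout_s
    # The handler fires periodically during execution; returning a truthy
    # value aborts the query, which enforces the timeout.
    conn.set_progress_handler(lambda: time.monotonic() > deadline, 10_000)
    try:
        cur = conn.execute(sql)
        return cur.fetchmany(max_rows)  # structured rows, capped
    finally:
        conn.set_progress_handler(None, 0)
```

The row cap and timeout mirror the limits described above; even a pathological query stops scanning once the deadline passes.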

5. Response Formatting

Convert raw query results into a human-readable response. For simple counts or sums, return a plain text answer. For multi-row results, generate a formatted table. For time-series data, generate a chart. The LLM can also summarize the results in natural language: "Revenue from your top 10 customers was $2.3M last quarter, up 15% from Q2."
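The format decision can key off the result's shape. A small sketch (chart generation omitted; the plain-text table layout is just one reasonable choice):

```python
def format_result(rows: list[tuple], columns: list[str]) -> str:
    """Scalar result -> one-line answer; multi-row result -> text table."""
    if len(rows) == 1 and len(columns) == 1:
        return f"{columns[0]}: {rows[0][0]}"
    header = " | ".join(columns)
    lines = [header, "-" * len(header)]
    lines += [" | ".join(str(v) for v in row) for row in rows]
    return "\n".join(lines)
```

The natural-language summary quoted above would be a separate LLM call that takes the question plus the formatted rows as input.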

Schema Context: The Make-or-Break Factor

The quality of your text-to-SQL system depends almost entirely on how well the LLM understands your schema. Raw DDL (CREATE TABLE statements) is a starting point but not sufficient.

What to Include in Schema Context

  • Table and column descriptions. "users.mrr" doesn't tell the LLM anything. "users.mrr: Monthly recurring revenue in USD cents, updated nightly" does. Add plain-English descriptions for every table and column that business users might ask about.
  • Common join patterns. Show the LLM how tables relate. "orders.user_id references users.id" is basic. "To get customer revenue, join orders to users on user_id and sum orders.amount_cents / 100" is actionable.
  • Business logic definitions. "Active user" might mean "logged in within 30 days" in your system. "Churn" might mean "subscription cancelled or payment failed for 60+ days." Document these definitions so the LLM generates queries that match your team's understanding.
  • Example queries. Include 5 to 10 example question/SQL pairs that represent common queries. Few-shot examples dramatically improve accuracy for your specific schema.
  • Enum values and categories. If a status column contains "active," "trialing," "cancelled," and "paused," list these values. Otherwise the LLM might generate WHERE status = 'Active' (wrong case) or WHERE status = 'churned' (non-existent value).

Building good schema documentation takes 2 to 5 days for a typical SaaS database. It's the highest-ROI investment in the entire project.
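Pulling the five elements together, a schema-context entry might look like this (every name and definition here is hypothetical, standing in for your own documentation):

```python
# One table's worth of schema context, fed into the generation prompt.
SCHEMA_CONTEXT = """\
Table: users -- one row per account
  id          integer, primary key
  mrr         Monthly recurring revenue in USD cents, updated nightly
  status      one of: 'active', 'trialing', 'cancelled', 'paused'
  created_at  signup timestamp (UTC)

Join patterns:
  orders.user_id references users.id
  Customer revenue = join orders to users on user_id, sum(orders.amount_cents) / 100

Definitions:
  Active user = logged in within the last 30 days
  Churn       = subscription cancelled, or payment failed for 60+ days

Example:
  Q: How many active users do we have?
  SQL: SELECT COUNT(*) FROM users WHERE last_login_at >= now() - interval '30 days'
"""
```

Note the enum values are listed verbatim, lowercase, so the model never invents `'Active'` or `'churned'`.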

[Image: Database schema diagram showing table relationships for business intelligence queries]

Handling Ambiguity and Edge Cases

Natural language is inherently ambiguous. "How many new users did we get?" could mean users who registered, users who made their first purchase, or users who activated their account. Your system needs strategies for handling this.

Clarification Prompts

When a question is ambiguous, have the system ask for clarification rather than guessing. "By 'new users,' do you mean users who registered, or users who completed onboarding?" This adds a round trip but prevents wrong answers that erode trust.

Default Interpretations

For common ambiguities, define defaults in your system prompt. "'Users' refers to the users table. 'New users' means users.created_at within the specified time period. 'Revenue' means sum of payments.amount_cents / 100 where payments.status = 'succeeded'." Document these defaults so business users know what the system assumes.

Time Range Handling

"Last month" seems straightforward until you realize it could mean the last 30 days, the previous calendar month, or the last billing period. Define your default time interpretation and make the system explicit about what range it's using: "Showing data for March 1 to March 31, 2026."

Complex Questions

Text-to-SQL works well for straightforward queries: aggregations, filters, simple joins, and time-based comparisons. It struggles with multi-step analysis like "Find customers whose usage dropped by more than 50% month-over-month for three consecutive months and flag them as churn risks." For complex questions, break them into multiple simpler queries or build pre-computed views that simplify the SQL the LLM needs to generate.

Query Safety and Access Control

Letting an LLM generate and execute SQL against your database is inherently risky. Here's how to make it safe:

Read-Only Access

The database user for text-to-SQL should have SELECT-only permissions. No INSERT, UPDATE, DELETE, CREATE, or DROP. Period. Use a read replica so even a pathological query can't impact production performance.

Query Allowlisting

Allow only plain SELECT statements: parse every generated query before execution and reject anything containing DDL statements, stored procedure calls, or system table access. Use a SQL parser (like sqlparse for Python) rather than regex to catch edge cases such as stacked statements.

Row Limits and Timeouts

Automatically append LIMIT 1000 to queries that don't have a limit. Set a 15 to 30 second query timeout. Both prevent accidental full-table scans that could slow down your replica and return unusably large result sets.
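Appending the limit can be sketched as below. The end-of-string regex check is a simplification; a more robust version would inspect the parse tree, since "LIMIT" can legally appear inside string literals or subqueries:

```python
import re

def enforce_limit(sql: str, max_rows: int = 1000) -> str:
    """Append LIMIT to queries that don't already end with one."""
    stripped = sql.rstrip().rstrip(";")
    if re.search(r"\bLIMIT\s+\d+\s*$", stripped, re.IGNORECASE):
        return stripped  # query already caps its own result size
    return f"{stripped} LIMIT {max_rows}"
```

The query timeout is set separately at the connection level (e.g. `statement_timeout` in PostgreSQL), so the two guards are independent.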

Data Access Boundaries

Not everyone should see all data. The CEO can query revenue numbers, but a marketing intern probably shouldn't access individual customer payment details. Implement row-level and column-level access controls. The simplest approach: maintain different schema contexts for different user roles, exposing only the tables and columns each role should access.
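That role-to-schema mapping can be a simple lookup; because the model only ever sees the tables listed for a role, it cannot generate SQL against anything else. The roles and table names below are hypothetical:

```python
# Each role sees only its slice of the schema context.
ROLE_SCHEMAS = {
    "executive": ["users", "orders", "payments", "subscriptions"],
    "marketing": ["users", "events", "campaigns"],  # no payment details
    "support":   ["users", "tickets"],
}

def schema_for_role(role: str, table_docs: dict[str, str]) -> str:
    """Build the schema context string from only the role's permitted tables."""
    allowed = ROLE_SCHEMAS.get(role, [])
    return "\n".join(f"{t}: {table_docs[t]}" for t in allowed if t in table_docs)
```

Pair this with validation that rejects any generated query referencing a table outside the role's list, so prompt-level filtering is backed by a hard check.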

Audit Logging

Log every question asked, every SQL query generated, and every result returned. This gives you an audit trail for compliance, a training dataset for improving accuracy, and visibility into what your team actually needs from data.
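One JSON line per query is enough to serve all three purposes. A minimal sketch (field names are an assumption; extend with role, latency, or error details as needed):

```python
import datetime as dt
import json

def audit_record(user: str, question: str, sql: str, row_count: int) -> str:
    """Serialize one query event as a JSON line for the audit log."""
    return json.dumps({
        "ts": dt.datetime.now(dt.timezone.utc).isoformat(),
        "user": user,
        "question": question,   # training data for accuracy improvements
        "sql": sql,             # audit trail for compliance review
        "row_count": row_count, # visibility into result sizes
    })
```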

Tools, Frameworks, and Build Options

You have several options for implementing text-to-SQL, from off-the-shelf tools to custom builds:

Custom Build with LLM APIs

Build your own pipeline using Claude or GPT-4 for query generation, a SQL parser for validation, and your preferred backend framework for orchestration. This gives maximum control over the schema context, safety rules, and user experience. Development time: 4 to 8 weeks. Cost: $20,000 to $60,000.

LangChain SQL Agent

LangChain provides a pre-built SQL agent that handles schema introspection, query generation, and error recovery. It works out of the box for simple setups but needs customization for production use (better schema context, safety rules, access control). Good starting point if you want to prototype quickly. Development time: 2 to 4 weeks to reach production readiness.

Commercial Tools

Products like Defog, Vanna.ai, and Databricks AI/BI offer hosted text-to-SQL solutions. These handle schema management, query optimization, and access control out of the box. Pricing ranges from $500 to $5,000/month depending on query volume and features. Best for teams that want results fast without building infrastructure.

BI Tool Integrations

Tools like Metabase, ThoughtSpot, and Looker are adding natural language query features. If you already use a BI platform, check whether their built-in AI features are sufficient before building custom. The accuracy is improving rapidly.

[Image: Developer building a text-to-SQL interface with code editor and database management tools]

Accuracy, Costs, and Getting Started

Let's set realistic expectations about text-to-SQL accuracy and what the project costs:

Accuracy Benchmarks

On standard benchmarks (Spider, Bird), state-of-the-art text-to-SQL systems achieve 80% to 90% accuracy for simple queries (single table, basic aggregation) and 50% to 70% for complex queries (multi-join, subqueries, window functions). In practice, with a well-documented schema and few-shot examples specific to your database, you can hit 85% to 95% accuracy for the types of questions your team actually asks.

The key insight: your team asks the same 50 to 100 types of questions repeatedly. Optimize for those specific patterns, not for arbitrary SQL generation. A system that handles your team's actual questions 95% of the time is far more valuable than one that handles any possible question 70% of the time.

Project Costs

  • Prototype (2 to 3 weeks, $8,000 to $15,000): Basic text-to-SQL with your schema, single user role, simple query validation. Enough to test whether your team will actually use it.
  • Production system (4 to 8 weeks, $20,000 to $60,000): Full safety controls, role-based access, query caching, visualization layer, audit logging, and integration with Slack or your internal tools.
  • Enterprise deployment (8 to 14 weeks, $60,000 to $120,000): Multi-database support, complex access controls, custom fine-tuning, admin dashboard, and automated accuracy monitoring.

Ongoing LLM API costs are modest: $50 to $500/month for most teams, depending on query volume and model choice.

The biggest ROI comes from reducing the data team's ad-hoc query burden. If your analysts spend 20 hours per week pulling reports for other teams, and text-to-SQL handles 70% of those requests, you've freed up 14 analyst-hours per week for higher-value work.

Ready to give your business team direct access to data insights? Book a free strategy call and we'll assess whether text-to-SQL is the right fit for your data stack.

Need help building this?

Our team has launched 50+ products for startups and ambitious brands. Let's talk about your project.

AI data analyst · text-to-SQL · natural language database query · LLM data analysis · AI business intelligence

Ready to build your product?

Book a free 15-minute strategy call. No pitch, just clarity on your next steps.

Get Started