What an AI SDR Actually Does
An AI sales development rep takes a target customer profile, finds real humans that match it, researches them, writes a personalized outbound email, sends it, handles the replies, books meetings on behalf of an account executive, and updates your CRM. It is not a chatbot. It is not a sequence tool with a LinkedIn message generator bolted on. It is an autonomous agent that runs a full outbound workflow from ICP definition to meeting booked.
The companies raising real money in this category (11x, Artisan, Regie, Relevance AI, Clay) are charging $1,500 to $10,000 per month per deployed agent. Their customers report 2 to 5x the output of a human SDR at 30 to 60% of the cost. The numbers work. The technical execution is the hard part.
If you are building one of these systems, you need to solve four problems well: data quality, message quality, deliverability, and evaluation. Each one is a dedicated workstream. Skipping any of them means you ship a demo that impresses investors and then fails the moment a real customer tries to use it at scale.
Architecture Overview: The Five Core Services
An AI SDR is not a single prompt. It is a coordinated system of specialized services, each doing one thing well. Here are the five services that every production AI SDR needs.
Lead sourcing service. Pulls accounts and contacts matching the ICP from data providers (Apollo, Clay, ZoomInfo, People Data Labs). Handles deduping against existing CRM data, filters out accounts already in an active sequence, and scores leads by fit and intent signals.
Research and enrichment service. Takes a lead and gathers context: company news, LinkedIn activity, recent funding, tech stack (BuiltWith, Wappalyzer), job postings, podcast appearances, press mentions. This is the raw material the message generator uses for personalization.
Message generation service. An LLM-powered service that writes outbound emails, LinkedIn messages, and voicemail scripts. It takes the lead context and your messaging guidelines and produces copy that passes human review.
Sending and deliverability service. Manages email accounts, sending infrastructure, warmup, SPF/DKIM/DMARC, reply detection, and deliverability monitoring. This is the boring plumbing that makes or breaks your product.
CRM and orchestration service. The control plane. Talks to Salesforce, HubSpot, or Attio. Tracks sequence state, routes replies to the right human or AI handler, triggers follow-ups, and logs everything.
Our AI sales pipeline automation guide covers the broader revenue ops landscape that an AI SDR plugs into. This article focuses on the specific technical architecture of the agent itself.
Lead Sourcing and ICP Scoring
Data quality is the foundation. If your lead sourcing is bad, everything downstream compounds the problem. Here is how to get it right.
The data provider stack. No single vendor has complete coverage. Most serious AI SDRs use a blend: Apollo for firmographic breadth, Clay for enrichment orchestration, ZoomInfo for high-touch enterprise, People Data Labs for person-level enrichment, and LinkedIn Sales Navigator via Phantombuster or similar for real-time signals. Budget $2,000 to $15,000 per month in data provider costs per customer you are serving.
ICP definition. Do not ask customers to write a SQL query. Build an ICP builder UI that lets them define filters in plain language ("B2B SaaS companies with 50 to 500 employees that use HubSpot and raised a Series A in the last 18 months"). Translate to structured queries against the data provider APIs.
Lead scoring. Combine firmographic fit (company size, industry, geography), technographic fit (uses your target tech stack), and intent signals (hiring for relevant roles, recent funding, product launches, inbound traffic). Weight them based on what the customer's best-fit accounts look like. Train a small scoring model on the customer's closed-won data once you have enough of it.
Deduplication. Match new leads against existing CRM contacts by email, LinkedIn URL, phone, and fuzzy name plus company matching. A duplicate outreach to someone already in a sequence is a trust-destroying mistake. Be paranoid about deduping.
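A sketch of the matching cascade, using the standard library's `difflib` for the fuzzy fallback (a real system would layer a proper entity-resolution step on top of this):

```python
import re
from difflib import SequenceMatcher


def _norm(s: str) -> str:
    """Lowercase and strip punctuation/whitespace for comparison."""
    return re.sub(r"[^a-z0-9]", "", s.lower())


def is_duplicate(new: dict, existing: dict, fuzzy_threshold: float = 0.9) -> bool:
    """Match a sourced lead against an existing CRM contact.

    Cascade: exact email, normalized LinkedIn URL, then fuzzy name + company.
    """
    if new.get("email") and new["email"].lower() == (existing.get("email") or "").lower():
        return True
    if new.get("linkedin") and _norm(new["linkedin"]) == _norm(existing.get("linkedin") or ""):
        return True
    # Fuzzy fallback requires a name on both sides -- two empty strings
    # would otherwise compare as identical.
    if not (new.get("name") and existing.get("name")):
        return False
    name_sim = SequenceMatcher(None, _norm(new["name"]), _norm(existing["name"])).ratio()
    company_sim = SequenceMatcher(
        None, _norm(new.get("company", "")), _norm(existing.get("company", ""))
    ).ratio()
    # BOTH name and company must clear the threshold to count as a duplicate.
    return name_sim >= fuzzy_threshold and company_sim >= fuzzy_threshold
```

Tune the threshold toward false positives: wrongly suppressing one lead is cheap, double-emailing a prospect is not.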
Suppression lists. Customers will provide do-not-contact lists (current customers, ex-employees, competitors, sensitive accounts). Respect them before anything else runs. Missing a suppression is a fireable offense.
Message Generation: Where LLMs Earn Their Keep
This is the part that everyone thinks is easy because "ChatGPT can write a sales email." It is not easy. Generic AI-written outbound killed the open rates of everyone who leaned on it in 2024. The second generation of this problem is won or lost on quality.
The prompt architecture. Do not use a single massive prompt. Break message generation into stages: research summarization (LLM reads raw context and extracts 3 to 5 talking points), angle selection (picks the most compelling angle based on customer messaging), copy generation (writes the email), and self-review (a second LLM call checks tone, length, accuracy, and spam triggers).
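The staged pipeline looks roughly like this. `call_llm` is a placeholder for your provider client (Anthropic, OpenAI, or a routing layer), and the model names and prompts are illustrative:

```python
from typing import Callable

# Placeholder signature: takes (model, prompt), returns the model's text reply.
LLM = Callable[[str, str], str]


def generate_message(lead_context: str, guidelines: str, call_llm: LLM) -> dict:
    """Run message generation as four small stages instead of one mega-prompt."""
    # Stage 1: a cheap model extracts 3-5 concrete talking points from raw research.
    points = call_llm("small-model", f"Extract 3-5 talking points:\n{lead_context}")
    # Stage 2: pick the single most compelling angle given the customer's messaging.
    angle = call_llm("small-model", f"Guidelines:\n{guidelines}\nPick the best angle from:\n{points}")
    # Stage 3: the strong model writes the actual copy.
    draft = call_llm("strong-model", f"Write a short outbound email.\nAngle: {angle}\nGuidelines: {guidelines}")
    # Stage 4: a second pass checks tone, length, accuracy, and spam triggers.
    review = call_llm("small-model", f"Review for tone, length, and spam triggers. Reply PASS or FAIL:\n{draft}")
    return {"draft": draft, "review": review, "approved": review.strip().upper().startswith("PASS")}
```

Because each stage is a separate call, you can route stages 1, 2, and 4 to cheap models, cache the research summary per lead, and swap the copy model without touching the rest of the pipeline.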
Model choice. Claude Sonnet 4.5 or GPT-4o for the main copy generation step. They are consistently better at tone and nuance than cheaper models. Use Claude Haiku, GPT-4o-mini, or Gemini Flash for the simpler steps (summarization, self-review) to keep costs down. Expect $0.10 to $0.40 in total LLM cost per generated message.
Personalization that does not feel gross. The trap is using a specific detail to prove you researched the person but in a way that feels creepy ("I saw you posted about your dog on LinkedIn last Tuesday"). The right pattern is context-aware framing ("Given your recent focus on outbound automation, I thought you might find this interesting") without calling out specific personal details.
Human guidelines. Let customers define brand voice, banned words, approved phrases, target email length, CTA style, and tone preferences. Feed these into every generation call. The goal is that the AI sounds like the customer, not like ChatGPT.
Output format. Return structured output (JSON) from the LLM so you can separate subject line, preview text, body, and CTA. Post-process for length, spam triggers, and forbidden phrases before queuing for send.
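A minimal parser for that structured output. The field names here are our own convention, enforced via the generation prompt (or a provider's structured-output mode where available):

```python
import json

REQUIRED_FIELDS = ("subject", "preview", "body", "cta")


def parse_message(raw_llm_output: str) -> dict:
    """Parse the generator's JSON output into its components.

    Fails loudly on missing fields so a malformed generation gets retried
    instead of queued; downstream checks handle length and banned phrases.
    """
    msg = json.loads(raw_llm_output)
    missing = [k for k in REQUIRED_FIELDS if k not in msg]
    if missing:
        raise ValueError(f"generator output missing fields: {missing}")
    # Normalize to plain stripped strings so send-time templating is predictable.
    return {k: str(msg[k]).strip() for k in REQUIRED_FIELDS}
```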
Deliverability: The Boring Part That Kills Companies
You can have the best lead sourcing and the best AI copy in the world and still fail because your emails land in spam. Deliverability is the plumbing. It is not sexy and it is not what fundraising decks focus on, but it is where AI SDRs live or die.
Multi-inbox infrastructure. Do not send all your outbound from one domain or one inbox. Each customer needs multiple sending inboxes (typically 3 to 10) on dedicated sending domains. Use tools like Instantly, Smartlead, or build your own on top of Google Workspace or Microsoft 365 APIs. Expect $20 to $80 per inbox per month in costs.
Warmup. Every new inbox needs 14 to 30 days of warmup before it sends real outbound. Use an automated warmup service or build your own network of cooperating inboxes that send test emails to each other, open them, mark them as important, and reply. Skipping warmup is the fastest way to get a domain flagged.
SPF, DKIM, DMARC. Table stakes. Every sending domain needs all three records configured. DMARC should be at least "p=quarantine" with reporting enabled. If your customer does not know what these are, configure them for the customer. Do not let them ignore it.
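For reference, the three records on a sending domain look like this as DNS TXT entries. The domain, DKIM selector, `include:` value, and key are placeholders that depend on your sending provider:

```
; SPF: authorize only your sending service to send for this domain
yourdomain.com.                       TXT  "v=spf1 include:_spf.google.com ~all"

; DKIM: public key published under the selector your sender signs with
selector1._domainkey.yourdomain.com.  TXT  "v=DKIM1; k=rsa; p=<public-key>"

; DMARC: quarantine failures and send aggregate reports to an inbox you monitor
_dmarc.yourdomain.com.                TXT  "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@yourdomain.com"
```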
Volume throttling. Never send more than 30 to 50 emails per day per inbox to cold recipients. Ramp volume gradually. Randomize send times within business hours. Add realistic delays between sends. The goal is to look like a human, not a bot.
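A sketch of the pacing logic, with illustrative defaults. A real scheduler also respects the recipient's timezone and skips weekends and holidays:

```python
import random
from datetime import datetime, timedelta


def schedule_sends(
    n_messages: int,
    day_start: datetime,
    daily_cap: int = 40,
    business_hours: tuple[int, int] = (9, 17),
    min_gap_minutes: int = 4,
) -> list[datetime]:
    """Spread at most `daily_cap` sends across business hours with random gaps.

    Returns the send times for one day; anything that does not fit rolls
    to the next day in the caller.
    """
    count = min(n_messages, daily_cap)
    start_h, end_h = business_hours
    window_end = end_h * 60  # minutes after midnight
    midnight = day_start.replace(hour=0, minute=0, second=0, microsecond=0)
    times = []
    cursor = start_h * 60
    for _ in range(count):
        # Random gap so sends never land on a predictable cadence.
        cursor += random.randint(min_gap_minutes, min_gap_minutes * 3)
        if cursor >= window_end:
            break  # window exhausted; remaining sends roll to tomorrow
        times.append(midnight + timedelta(minutes=cursor))
    return times
```

With these defaults the worst case is 40 sends over 8 hours, which stays comfortably inside the 30-to-50-per-day guidance above.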
Reply monitoring. Detect out-of-office, bounce, negative, positive, and forwarding replies. Handle each type differently. Out-of-office should trigger a delayed follow-up. Bounces should remove the lead and update data quality scores. Negative replies should stop the sequence immediately and add to suppression. Positive replies should route to human handoff or AI conversation follow-up.
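The routing table above can be sketched as a classifier plus a dispatch map. The keyword heuristics are a first-pass filter only; ambiguous replies should fall through to an LLM classifier rather than a keyword guess:

```python
from enum import Enum


class ReplyType(Enum):
    OUT_OF_OFFICE = "ooo"
    BOUNCE = "bounce"
    NEGATIVE = "negative"
    POSITIVE = "positive"
    FORWARD = "forward"


# Illustrative phrase lists; a production system learns these from labeled replies.
_RULES = [
    (ReplyType.BOUNCE, ["delivery has failed", "address not found", "mailbox unavailable"]),
    (ReplyType.OUT_OF_OFFICE, ["out of office", "on vacation", "parental leave"]),
    (ReplyType.NEGATIVE, ["not interested", "unsubscribe", "remove me"]),
    (ReplyType.FORWARD, ["forwarding you to", "looping in", "right person is"]),
]

_ACTIONS = {
    ReplyType.OUT_OF_OFFICE: "pause_sequence_and_retry_after_return",
    ReplyType.BOUNCE: "remove_lead_and_penalize_data_source",
    ReplyType.NEGATIVE: "stop_sequence_and_add_to_suppression",
    ReplyType.FORWARD: "create_new_lead_for_referred_contact",
    ReplyType.POSITIVE: "handoff_to_human_or_ai_follow_up",
}


def route_reply(body: str) -> dict:
    """Classify a reply and return the action the orchestrator should take."""
    lower = body.lower()
    for reply_type, phrases in _RULES:
        if any(p in lower for p in phrases):
            break
    else:
        # Unmatched replies default to the positive/handoff path so a human
        # sees them -- never silently continue the sequence on an unknown reply.
        reply_type = ReplyType.POSITIVE
    return {"type": reply_type, "action": _ACTIONS[reply_type]}
```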
Blacklist monitoring. Check sending IPs against blacklists daily (Spamhaus, Barracuda, SORBS). Rotate IPs when needed. Monitor spam complaint rates and bounce rates per domain.
CRM Integration and Orchestration
The CRM is the system of record. Your AI SDR is just a worker that operates on top of it. If your CRM sync is flaky, customers will lose trust in an hour.
Pick two CRMs for v1. HubSpot and Salesforce cover 70% of the B2B SaaS market. Start with HubSpot because the API is easier. Add Salesforce in month three. Attio, Pipedrive, and Close come later.
Bi-directional sync. The AI SDR reads from the CRM (existing contacts, accounts, opportunities, suppression lists) and writes to it (new leads, activities, notes, task creation for humans). Treat the CRM as the source of truth for account and contact data. Use your own internal state store for agent state and sequence progress.
Activity logging. Every email, every reply, every meeting booked should log as an activity in the CRM with the full message body, timestamp, and AI reasoning if possible. Humans need to be able to audit what the AI did. "Show me every message the AI sent this week" is a question every customer asks in week one.
Webhook reliability. CRM webhooks fail. Always poll as a backup. Use a durable job queue like Temporal or BullMQ with retries and dead letter queues. Every state change needs a retry mechanism.
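Temporal and BullMQ give you retries and dead-letter queues for free; this sketch shows the behavior your sync jobs rely on, not a replacement for a durable queue. The `sleep` parameter is injectable so the backoff is testable:

```python
import time


def with_retries(fn, max_attempts=5, base_delay=1.0, dead_letter=None, sleep=time.sleep):
    """Run a sync job with exponential backoff; park it on final failure.

    `dead_letter` is any list-like sink where a human later investigates
    jobs that exhausted their retries.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts:
                if dead_letter is not None:
                    dead_letter.append(exc)  # parked for manual investigation
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, 8s ...
```

The same wrapper applies to the polling backup: poll on a schedule, and run each reconciliation pass through the retry path so a transient CRM outage never drops a state change.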
Human handoff. When the AI gets a positive reply, it should schedule a meeting if possible, then route the conversation to a human account executive with full context. Integrate with Calendly or Chili Piper for booking.
If you are building voice-based AI agents alongside or instead of email agents, our AI voice agent build guide covers the additional complexity of real-time voice infrastructure and call routing.
Evaluation and Guardrails
The single biggest production risk with an AI SDR is that it sends an embarrassing, incorrect, or off-brand email and your customer finds out from their CEO. You need evaluation loops and guardrails to make that risk manageable.
Pre-send checks. Before any message leaves your system, run it through a checklist: under 150 words, no banned phrases, no PII leakage, no mention of competitors, no fabricated facts, proper formatting. A second LLM call can handle fuzzy checks (tone, brand voice, relevance). A regex and rules engine handles deterministic checks.
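The deterministic half of that checklist is a plain rules function. Thresholds and patterns here are illustrative; the fuzzy checks (tone, brand voice, relevance) go to the second LLM call and are not shown:

```python
import re


def pre_send_checks(message: dict, banned: list[str], competitors: list[str]) -> list[str]:
    """Return a list of failures; an empty list means the message may queue."""
    body = message.get("body", "")
    lower = body.lower()
    failures = []
    if len(body.split()) > 150:
        failures.append("over 150 words")
    if not message.get("subject"):
        failures.append("missing subject")
    failures += [f"banned phrase: {p}" for p in banned if p.lower() in lower]
    failures += [f"mentions competitor: {c}" for c in competitors if c.lower() in lower]
    # Crude PII-leak check: a raw US-style phone number pasted into the body.
    if re.search(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b", body):
        failures.append("possible phone number in body")
    return failures
```

Keep this function pure and fast so you can also run it in the offline evaluation harness against every generated message.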
Human-in-the-loop mode. For new customers and high-stakes campaigns, every message should be queued for human review before sending. For trusted customers and low-risk campaigns, run in autonomous mode with periodic sampling by humans. Let customers toggle this per sequence.
Offline evaluation. Maintain a test set of 500 to 2,000 leads with known expected behavior. Run new model versions or prompt changes against the test set before deploying. Track win rate, personalization quality, and hallucination rate. Tools like Braintrust, Langfuse, or Helicone make this easier.
Online monitoring. Log every prompt, response, and downstream outcome. Tag conversations by quality when replies come in. Use this data to identify failure patterns and continuously improve prompts and examples.
Rollback strategy. Version your prompts, models, and messaging guidelines. When quality drops, you need to be able to revert to the last known good config in minutes, not hours.
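The contract that makes minute-level rollback possible is an append-only config store with an active-version pointer. In production this lives in Postgres; the in-memory sketch shows the shape:

```python
from dataclasses import dataclass, field


@dataclass
class PromptConfigStore:
    """Append-only store of prompt/model configs with instant rollback.

    Configs are immutable once published; rollback just moves the pointer,
    so reverting is O(1) and never mutates history.
    """
    _versions: list[dict] = field(default_factory=list)
    _active: int = -1

    def publish(self, config: dict) -> int:
        """Append a new config and make it active. Returns its version number."""
        self._versions.append(config)
        self._active = len(self._versions) - 1
        return self._active

    def active(self) -> dict:
        return self._versions[self._active]

    def rollback(self, version: int) -> None:
        if not 0 <= version < len(self._versions):
            raise ValueError("unknown version")
        self._active = version
```

Tag every outbound message with the config version that produced it, so when quality drops you can both revert and attribute the damage.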
For agent architecture patterns more broadly, our AI agents for business guide covers the multi-agent and orchestration decisions that apply here.
Tech Stack and How to Launch
Here is the stack that we recommend for a production AI SDR built in 2026, plus the sequence for getting from zero to paying customers.
Backend. Python with FastAPI for the AI services (message generation, research, scoring). Node.js with Fastify for the CRM integration and orchestration layer. Postgres for everything relational. Redis for caches and queues. Temporal for long-running workflows and reliable retries.
LLM providers. Anthropic Claude and OpenAI GPT-4o as primary providers. Use a routing layer (OpenRouter, Portkey, or custom) so you can switch models without rewriting code. Budget for 20 to 40% of revenue in LLM costs in year one.
Data providers. Apollo, Clay, and People Data Labs for most customers. Add ZoomInfo as an upsell for enterprise buyers.
Email infrastructure. Smartlead or Instantly as the sending engine, or build on top of Google Workspace and Microsoft Graph APIs. Mailgun or Postmark for transactional emails (meeting confirmations, notifications).
CRM SDKs. HubSpot Node SDK, jsforce for Salesforce, official clients for Attio. Wrap them in a thin abstraction so you can add new CRMs without rewriting business logic.
Observability. Langfuse or Braintrust for LLM tracing and evaluation, Sentry for errors, Grafana Cloud for metrics and logs.
Frontend. Next.js 15 with Tailwind and shadcn/ui. TanStack Query for data fetching. Clerk or WorkOS for auth.
Launch sequence. Month 1 to 2: lead sourcing, message generation, Gmail-based sending for one pilot customer. Month 3 to 4: CRM integration (HubSpot), guardrails, reply handling, human review mode. Month 5 to 6: Salesforce, multi-inbox infrastructure, autonomous mode, first 5 to 10 paying customers. Month 7 to 12: scale, vertical specialization, and enterprise features.
Team size for v1: 3 to 5 engineers (2 backend, 1 LLM specialist, 1 full-stack, 0 to 1 devops), 1 designer, 1 founder doing product and sales. Total cost to reach a credible v1 is $300K to $700K depending on team seniority.
The AI SDR category is moving fast and the technical bar is rising every month. The winners will combine strong data quality, excellent copy, reliable deliverability, and relentless evaluation. Most teams underestimate the deliverability work and over-invest in prompt engineering. Balance matters.
If you are scoping an AI SDR build or trying to decide between building in-house and partnering with an existing platform, we help founders make these decisions every week. Book a free strategy call to walk through the architecture and trade-offs for your specific use case.
Need help building this?
Our team has launched 50+ products for startups and ambitious brands. Let's talk about your project.