How to Build AI-Powered Search for Your Ecommerce Platform

Traditional keyword search loses revenue every time a shopper types a natural-language query and gets zero results. This guide walks through building AI-powered ecommerce search from the ground up, covering vector embeddings, hybrid retrieval, query understanding, and the tools that make it production-ready.

Nate Laquis

Founder & CEO

Why Ecommerce Search Is Broken (and What AI Changes)

Site search is the highest-intent touchpoint on any ecommerce platform. Shoppers who use the search bar convert at 2 to 3x the rate of those who browse. Yet most ecommerce search still runs on keyword matching, a technology that has not fundamentally changed since the early 2000s. A customer searching for "lightweight running shoes for flat feet under $120" gets a results page full of every product containing the word "running" or "shoes," sorted by some arbitrary relevance score that has nothing to do with what the shopper actually wants.

The cost of bad search is staggering. Baymard Institute research shows that 70% of ecommerce search engines fail to return useful results for product-type synonyms. When a shopper types "notebook" meaning a laptop, they get stationery. When they type "sofa" but your catalog uses "couch," they get zero results. Every failed search is a lost sale, and most shoppers will not try a second query.

AI-powered search fixes this by understanding intent, not just matching strings. It uses vector embeddings to capture the semantic meaning of both queries and products, so "lightweight running shoes for flat feet under $120" actually retrieves lightweight, supportive running shoes in the right price range. It handles typos, synonyms, and natural language without manual configuration, and it learns from user behavior to improve ranking over time.

This guide covers the full architecture: vector embeddings for your product catalog, hybrid retrieval combining semantic and keyword signals, query understanding pipelines, personalized ranking, visual search, and the specific tools you need. Whether you are upgrading an existing Elasticsearch setup or building from scratch, you will walk away with a clear blueprint.

Semantic Search vs. Keyword Search: The Core Difference

Keyword search works by tokenizing a query into individual terms, then matching those tokens against an inverted index of product titles, descriptions, and attributes. The standard algorithm behind this is BM25, a ranking function that scores documents based on term frequency, inverse document frequency, and document length. BM25 is fast, well-understood, and surprisingly effective when queries are short and precise. It falls apart when queries are natural language, when vocabulary mismatches exist between the query and the catalog, or when intent matters more than exact words.

Semantic search takes a fundamentally different approach. Instead of matching tokens, it converts both queries and product data into dense vector representations (embeddings) in a high-dimensional space. Products and queries that are semantically similar end up close together in this space, even if they share zero words. A query for "summer dresses under $50" lands near lightweight, affordable dresses in the embedding space regardless of whether the product description uses the word "summer."

The embedding models that power this have improved dramatically. OpenAI's text-embedding-3-small, Cohere's embed-v3, and open-source models like BGE-large all produce high-quality embeddings for ecommerce use cases. The key is that you embed both your product catalog and incoming queries with the same model, then find the nearest product vectors for each query vector using approximate nearest neighbor (ANN) search.
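
To make the nearest-neighbor step concrete, here is a minimal sketch in Python. It uses brute-force cosine similarity over hand-written three-dimensional vectors. The product IDs and vector values are invented for illustration; in a real system the vectors would come from your embedding model, and the lookup would go through an ANN index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_products(query_vec, product_vecs, k=2):
    """Exact top-k by cosine similarity. An ANN index approximates this ranking."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in product_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real models output hundreds or thousands of dimensions.
PRODUCT_VECS = {
    "linen-sundress": [0.9, 0.1, 0.0],
    "wool-overcoat": [0.1, 0.9, 0.1],
    "cotton-maxi-dress": [0.8, 0.2, 0.1],
}
# Stand-in for the embedding of "summer dresses under $50".
QUERY_VEC = [0.85, 0.15, 0.05]

top = nearest_products(QUERY_VEC, PRODUCT_VECS, k=2)
print([pid for pid, _ in top])
```

With these toy values the two dresses score far above the overcoat, even though no text matching takes place.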

Where keyword search still wins

Semantic search is not universally better. For exact SKU lookups ("Nike Air Max 90 CW7483-100"), brand name searches, and highly specific attribute queries ("size 10 wide"), keyword search with proper field boosting outperforms vector search. The product title contains the exact string the shopper typed. No semantic interpretation needed. This is why production ecommerce search systems almost always use a hybrid approach, combining both methods. We will cover hybrid architecture in the next section.

For a deeper look at how AI search compares to traditional keyword-based systems, see our guide on how to build AI search.

Building a Hybrid Search Architecture (BM25 + Vectors)

The most effective ecommerce search systems combine keyword retrieval and semantic retrieval in a hybrid pipeline. The idea is straightforward: run both a BM25 query and a vector similarity query against your product catalog, then merge the results using a fusion algorithm that balances both signals. This gives you the precision of keyword matching for exact queries and the recall of semantic search for natural language queries.

The retrieval pipeline

A typical hybrid pipeline works in three stages. First, the query hits both a keyword index (Elasticsearch or OpenSearch for BM25 scoring, or Meilisearch, which applies its own ranking rules rather than BM25) and a vector index (pgvector, Qdrant, Pinecone, or Weaviate) in parallel. Second, a fusion layer merges the two result sets. The most common fusion algorithm is Reciprocal Rank Fusion (RRF), which assigns a score to each product based on its rank in each result set, then sorts by the combined score. Third, a reranking model (optional but powerful) rescores the top 50 to 100 merged results using a cross-encoder that considers the full query-product pair for higher accuracy.
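
The fusion stage is small enough to sketch in full. The function below is a minimal version of Reciprocal Rank Fusion: each result list contributes 1 / (k + rank) to a product's score. The constant k = 60 is a commonly used default, and the product IDs are placeholders rather than anything from a real catalog.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of product IDs into one list, best first.

    Each list adds 1 / (k + rank) to a product's score, with rank starting at 1.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, product_id in enumerate(results, start=1):
            scores[product_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)


# Placeholder result lists, best match first.
bm25_hits = ["sku-1", "sku-2", "sku-3"]
vector_hits = ["sku-3", "sku-1", "sku-4"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['sku-1', 'sku-3', 'sku-2', 'sku-4']
```

Products that appear in both lists rise to the top, which is why RRF works without needing the BM25 and vector scores to share a scale.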

Choosing your vector store

  • pgvector: If you already run PostgreSQL, this is the lowest-friction option. It handles catalogs up to 500K products without issues. Add an HNSW index for fast ANN queries. Cost: just your existing database infrastructure.
  • Qdrant: Purpose-built vector database with excellent filtering support, critical for ecommerce where you filter by category, price, availability, and brand alongside similarity. Open source with a managed cloud option starting at $25/month.
  • Pinecone: Fully managed, zero-ops vector search. Starts at $70/month for production workloads. Scales well but costs add up at high query volumes.
  • Weaviate: Hybrid search built in natively. Supports both vector and keyword search in a single query without external fusion logic. Great developer experience. Open source with a managed option.

Fusion tuning

The balance between keyword and vector signals is not fixed. For "Nike Air Max 90," BM25 should dominate because the shopper typed an exact product name. For "comfortable shoes for standing all day," vector search should lead. Smart implementations detect query type and adjust fusion weights dynamically. A simple heuristic: if the query contains a recognized brand name or SKU pattern, set the BM25 weight to 0.7. For longer, descriptive queries, set the vector weight to 0.7.
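
A rough sketch of that heuristic follows. The brand list, the SKU pattern, and the 0.3 complementary weight are assumptions made for illustration, and treating every non-brand query as descriptive is a simplification. The weights feed a weighted variant of rank fusion.

```python
import re

# Hypothetical brand vocabulary; in practice, load this from your catalog.
KNOWN_BRANDS = {"nike", "adidas", "samsonite"}
# Assumed SKU shape (two letters, four digits, dash, three digits), e.g. CW7483-100.
SKU_PATTERN = re.compile(r"\b[A-Z]{2}\d{4}-\d{3}\b")


def fusion_weights(query):
    """Favor BM25 for brand or SKU queries, vectors for everything else."""
    tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
    if SKU_PATTERN.search(query) or tokens & KNOWN_BRANDS:
        return {"bm25": 0.7, "vector": 0.3}
    return {"bm25": 0.3, "vector": 0.7}


def weighted_rrf(bm25_hits, vector_hits, weights, k=60):
    """Rank fusion where each list's contribution is scaled by its weight."""
    scores = {}
    for name, hits in (("bm25", bm25_hits), ("vector", vector_hits)):
        for rank, product_id in enumerate(hits, start=1):
            scores[product_id] = scores.get(product_id, 0.0) + weights[name] / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


print(fusion_weights("Nike Air Max 90"))
print(fusion_weights("comfortable shoes for standing all day"))
```

A trained query classifier can replace the rule-based check later without changing the fusion code.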

For a detailed comparison of search engines that support hybrid architectures, read our breakdown of Algolia vs. Meilisearch vs. Typesense.

Query Understanding: Typo Tolerance, Synonyms, and Intent Detection

Even the best retrieval pipeline is only as good as its understanding of what the shopper actually meant. Query understanding is the preprocessing layer that transforms a raw search input into a clean, enriched query before it ever hits the index. This layer alone can improve search relevance by 20 to 30%.

Typo tolerance and fuzzy matching

Shoppers make typos constantly, especially on mobile: "Addidas" instead of "Adidas," "Samsunite" instead of "Samsonite," "Nikey" instead of "Nike." Your search engine needs to handle these gracefully. Most modern search engines (Algolia, Typesense, Meilisearch) include typo tolerance out of the box using edit-distance algorithms (Levenshtein distance). If you are building a custom engine, implement a character n-gram index that matches partial strings, or use a spell-correction model trained on your query logs.

One subtlety: typo tolerance should scale with query length. Keep it strict for short queries, allowing little or no edit distance, and allow 1 to 2 characters of edit distance for longer ones. A one-character typo in a three-letter query changes the meaning entirely, while a two-character typo in a 15-character brand name is almost certainly unintentional.
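
As an illustration of length-aware tolerance, the sketch below pairs a standard Levenshtein implementation with an edit budget that grows with term length. The length thresholds (3 and 7 characters) and the vocabulary are assumptions for the example, not values taken from any particular engine.

```python
def levenshtein(a, b):
    """Edit distance computed with the classic two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def allowed_edits(term):
    """Hypothetical thresholds: no typos for short terms, up to 2 for long ones."""
    if len(term) <= 3:
        return 0
    if len(term) <= 7:
        return 1
    return 2


def fuzzy_match(term, vocabulary):
    """Return the closest vocabulary word within the edit budget, or None."""
    term = term.lower()
    budget = allowed_edits(term)
    candidates = [(levenshtein(term, word), word) for word in vocabulary]
    candidates = [pair for pair in candidates if pair[0] <= budget]
    return min(candidates)[1] if candidates else None


VOCABULARY = ["adidas", "nike", "samsonite"]
print(fuzzy_match("Addidas", VOCABULARY))    # adidas
print(fuzzy_match("samsunite", VOCABULARY))  # samsonite
print(fuzzy_match("cat", ["car"]))           # None
```

A full scan like this only suits a small vocabulary; production engines use n-gram or automaton-based indexes to find candidates quickly.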

Synonym expansion

Vocabulary mismatch is one of the top reasons ecommerce search fails. Your catalog says "sofa," the shopper types "couch." Manual synonym lists help but do not scale. AI-driven synonym detection analyzes query logs and click patterns to discover that shoppers who search "hoodie" and "sweatshirt" click on the same products. This behavioral signal is more reliable than any curated list. Algolia and Elasticsearch support manual synonym dictionaries. For automated discovery, cluster query embeddings and identify queries that produce similar click distributions. Searchspring and Bloomreach include this capability natively.
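
One way to approximate the click-distribution comparison is to treat each query's clicks as a sparse vector over products and flag query pairs with high cosine similarity. The click counts and the 0.8 threshold below are made up for illustration. This sketch covers only the click-overlap half of the idea, not the embedding clustering.

```python
import math
from itertools import combinations


def click_cosine(clicks_a, clicks_b):
    """Cosine similarity between two {product_id: click_count} dictionaries."""
    products = set(clicks_a) | set(clicks_b)
    dot = sum(clicks_a.get(p, 0) * clicks_b.get(p, 0) for p in products)
    norm_a = math.sqrt(sum(v * v for v in clicks_a.values()))
    norm_b = math.sqrt(sum(v * v for v in clicks_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def synonym_candidates(click_log, threshold=0.8):
    """Query pairs whose click distributions overlap strongly (threshold is an assumption)."""
    pairs = []
    for q1, q2 in combinations(sorted(click_log), 2):
        if click_cosine(click_log[q1], click_log[q2]) >= threshold:
            pairs.append((q1, q2))
    return pairs


# Hypothetical aggregated click counts per query.
CLICK_LOG = {
    "hoodie": {"p1": 40, "p2": 25, "p3": 5},
    "sweatshirt": {"p1": 35, "p2": 30, "p4": 3},
    "sofa": {"p9": 50, "p8": 20},
}

print(synonym_candidates(CLICK_LOG))  # [('hoodie', 'sweatshirt')]
```

Treat the output as candidates for review rather than automatic synonyms, since two queries can share clicks without meaning the same thing.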

Intent detection and query classification

Not all search queries are product searches. "Return policy," "track my order," and "do you ship to Canada" are navigational or informational queries that should route to content pages, not product results. A simple classifier (even logistic regression trained on labeled query data) can separate product queries from support queries from navigational queries.

For product queries, intent detection goes further. "Show me summer dresses under $50" contains both a product intent (summer dresses) and a price constraint ($50). A Named Entity Recognition (NER) model or a lightweight LLM prompt can extract structured attributes from natural language queries: product type, color, size, price range, brand, material, use case. These extracted attributes filter and boost retrieval results. This is how AI search turns "red leather boots size 8 under $200" into a precise, filtered query.
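
To show the shape of the output, here is a rule-based stand-in for the extraction step. It is not an NER model or an LLM; it uses regular expressions and small hand-written vocabularies (all hypothetical) to pull a price ceiling, size, color, and material out of the query. Whatever is left becomes the semantic query string.

```python
import re

# Hypothetical vocabularies; a real system would derive these from catalog attributes.
COLORS = {"red", "black", "blue", "white"}
MATERIALS = {"leather", "suede", "cotton"}


def parse_query(query):
    """Split a raw query into structured filters plus a residual semantic string."""
    text = query.lower()
    filters = {}

    price = re.search(r"under\s*\$?(\d+(?:\.\d+)?)", text)
    if price:
        filters["price_max"] = float(price.group(1))
        text = text.replace(price.group(0), " ")

    size = re.search(r"size\s+(\d+(?:\.\d+)?)", text)
    if size:
        filters["size"] = size.group(1)
        text = text.replace(size.group(0), " ")

    remaining = []
    for token in text.split():
        if token in COLORS:
            filters["color"] = token
        elif token in MATERIALS:
            filters["material"] = token
        else:
            remaining.append(token)

    return {"filters": filters, "semantic_query": " ".join(remaining)}


print(parse_query("red leather boots size 8 under $200"))
```

The filters map onto attribute filters in the search index, and the residual string goes to keyword and vector retrieval.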

Personalized Ranking and Behavioral Signals

Two shoppers typing the same query should not always see the same results. A returning customer who consistently buys premium brands should see premium products ranked higher. A bargain shopper should see deals first. Personalized ranking reorders search results based on individual user behavior, and it is one of the highest-leverage improvements you can make after basic semantic search is in place.

What behavioral signals to collect

  • Click-through data: Which products a user clicks from search results. High click-through on a product for a given query is a strong relevance signal.
  • Add-to-cart events: Stronger than clicks. A shopper who adds a product to their cart from search results is signaling clear purchase intent for that product type.
  • Purchase history: The strongest signal. Past purchases reveal brand preferences, price sensitivity, size preferences, and category affinity.
  • Dwell time: How long a shopper spends on a product page after clicking from search. Short dwell times suggest the result was not relevant.
  • Search refinements: When a shopper modifies their query, the original was likely unsatisfying. Track changes and eventual clicks.

Building the personalization layer

The simplest approach is a feature-based reranking model. Take the top 50 to 100 results from your hybrid retrieval pipeline, then rerank them using a learning-to-rank (LTR) model that includes personalization features: user-brand affinity, user-category affinity, price range match, and interaction history. XGBoost or LightGBM work well here. Train the model on historical click and conversion data.
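
The feature-building half of this can be sketched without any ML library. In the example below, a hand-weighted linear score stands in for the trained model; the weights, field names, and sample data are all hypothetical. A trained XGBoost or LightGBM ranker would replace the scoring function while consuming the same feature vector.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    product_id: str
    retrieval_score: float  # score from the hybrid retrieval stage
    brand: str
    category: str
    price: float


@dataclass
class UserProfile:
    brand_affinity: dict         # brand -> affinity in [0, 1]
    category_affinity: dict      # category -> affinity in [0, 1]
    typical_price_range: tuple   # (low, high)


def features(candidate, user):
    """Feature vector: retrieval score plus three personalization features."""
    low, high = user.typical_price_range
    return [
        candidate.retrieval_score,
        user.brand_affinity.get(candidate.brand, 0.0),
        user.category_affinity.get(candidate.category, 0.0),
        1.0 if low <= candidate.price <= high else 0.0,
    ]


# Hand-set weights for illustration only; a trained LTR model would learn these.
WEIGHTS = [1.0, 0.5, 0.3, 0.2]


def rerank(candidates, user):
    def score(candidate):
        return sum(w * f for w, f in zip(WEIGHTS, features(candidate, user)))
    return sorted(candidates, key=score, reverse=True)


premium_shopper = UserProfile({"Acme Premium": 0.9}, {"shoes": 0.8}, (150.0, 300.0))
results = [
    Candidate("budget-runner", 0.80, "ValueCo", "shoes", 60.0),
    Candidate("premium-runner", 0.70, "Acme Premium", "shoes", 180.0),
]
print([c.product_id for c in rerank(results, premium_shopper)])
```

Here the premium shoe overtakes the budget shoe despite a lower retrieval score, because the brand and price features match this shopper's history.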

For more sophisticated personalization, build user embedding vectors that capture shopping behavior in the same vector space as your product embeddings. At query time, bias the vector search toward products close to both the query vector and the user vector. This works particularly well for returning visitors with rich behavioral histories.
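
One simple way to bias retrieval is to blend the normalized query and user vectors before searching. The sketch below assumes both live in the same space as the product embeddings. The blend factor alpha = 0.2 and the three toy dimensions (roughly "running," "premium," "budget") are illustrative assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def blend(query_vec, user_vec, alpha=0.2):
    """Mix the user vector into the query vector; alpha=0 disables personalization."""
    q, u = normalize(query_vec), normalize(user_vec)
    return normalize([(1 - alpha) * qi + alpha * ui for qi, ui in zip(q, u)])


def rank(search_vec, product_vecs):
    """Order product IDs by cosine similarity to the (unit-length) search vector."""
    def similarity(vec):
        return sum(a * b for a, b in zip(search_vec, normalize(vec)))
    return sorted(product_vecs, key=lambda pid: similarity(product_vecs[pid]), reverse=True)


# Toy vectors over hypothetical (running, premium, budget) dimensions.
PRODUCTS = {
    "budget-runner": [0.8, 0.0, 0.6],
    "premium-runner": [0.8, 0.6, 0.0],
}
query_vec = [1.0, 0.0, 0.0]  # stand-in for embed("running shoes")
user_vec = [0.0, 1.0, 0.0]   # stand-in for a shopper with a premium purchase history

print(rank(blend(query_vec, user_vec), PRODUCTS))
```

Keeping alpha small keeps the results on-topic for the query while letting user history break near-ties.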

Cold-start users (first-time visitors) can still benefit from personalization using contextual signals: device type, geographic location (seasonal relevance varies by region), referral source (a visitor from a luxury fashion blog has different preferences than one from a coupon site), and time of day. These weak signals still produce measurable ranking improvements.

To understand how AI personalization fits into a broader ecommerce strategy, see our guide on AI for ecommerce.

Visual Search and Natural Language Product Queries

Text-based search, even smart semantic search, only captures part of how people discover products. Visual search lets shoppers photograph an item they like (a friend's jacket, a piece of furniture in a magazine, a pair of shoes on the street) and find similar products in your catalog instantly. Natural language queries let them describe what they want in conversational terms. Both capabilities expand the search interface beyond the traditional keyword box.

Implementing visual search

Visual search works by generating an image embedding from the uploaded photo, then running a nearest-neighbor search against pre-computed image embeddings of your catalog. The key models are CLIP (by OpenAI) and SigLIP (by Google), both producing embeddings in a shared text-image space. This enables cross-modal search: a text query can be matched against product images, and an uploaded photo can be matched against product text descriptions.

The implementation steps are concrete. First, generate image embeddings for every product photo in your catalog using CLIP or SigLIP and store them in your vector database (Qdrant, Weaviate, Pinecone, or pgvector). Second, build an upload endpoint that accepts a photo, generates its embedding, and queries the vector store. Third, add a camera icon to your search bar that triggers the device camera. The round-trip from upload to results should take under 500ms. Google Cloud Vision API and Amazon Rekognition offer managed alternatives if you prefer not to self-host models.

Natural language product queries

Beyond simple keyword and semantic search, shoppers increasingly expect to type full sentences: "show me summer dresses under $50," "waterproof hiking boots that work in snow," or "gift ideas for a 10-year-old who likes science." Handling these queries requires a query understanding pipeline that combines NER (extracting price, size, color, use case) with semantic retrieval.

The most effective approach uses a lightweight LLM (GPT-4o-mini, Claude 3.5 Haiku, or Gemma) as a query parser. The LLM extracts structured filters (price_max: 50, category: dresses, season: summer) and a semantic intent string ("casual summer dresses"). Your pipeline applies the filters to narrow the catalog and uses the semantic string for vector retrieval within that set. The LLM call adds 100 to 200ms and is dramatically more accurate than handling complex queries with embeddings alone.

A practical consideration: cache parsed query structures for common queries. If "summer dresses under $50" is searched 200 times a day, you do not need to call the LLM each time. A simple hash-based cache reduces LLM costs by 60 to 80% for ecommerce search workloads, where query patterns are highly repetitive.
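
A minimal in-memory version of that cache might look like the following. The parser function is a stand-in for the LLM call, and the class and method names are hypothetical. A production deployment would more likely keep the entries in a shared store such as Redis, with an expiry time.

```python
import hashlib


class ParsedQueryCache:
    """Caches parser output keyed by a hash of the normalized query text."""

    def __init__(self, parser):
        self._parser = parser
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Lowercase and collapse whitespace so trivial variants share one entry.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def parse(self, query):
        key = self._key(query)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        parsed = self._parser(query)
        self._store[key] = parsed
        return parsed


def fake_llm_parser(query):
    """Hypothetical stand-in for the LLM-based query parser."""
    return {"semantic_query": " ".join(query.lower().split())}


cache = ParsedQueryCache(fake_llm_parser)
cache.parse("Summer dresses under $50")
cache.parse("  summer  dresses under $50 ")
print(cache.hits, cache.misses)  # 1 1
```

Normalizing before hashing matters: without it, differences in casing and spacing would fragment the cache and lower the hit rate.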

Search Analytics: Measuring What Matters

You cannot improve what you do not measure. Search analytics tell you where your search engine is failing, what shoppers want that you do not carry, and which improvements will move revenue the most. Every ecommerce search implementation should ship with an analytics layer from day one.

Core metrics to track

  • Search conversion rate: The percentage of searches that lead to a purchase within the same session. This is your north-star metric. Industry average is 2 to 4%. AI-powered search should push this to 5 to 8%.
  • Zero-result rate: The percentage of queries that return no products. This should be under 5%. Every zero-result query is a direct failure. Track these queries and review them weekly.
  • Click-through rate on first result: If your top result is not getting clicked, your ranking is wrong. Target 30%+ CTR on the first result for branded queries, 15%+ for generic queries.
  • Search exit rate: How often shoppers leave the site immediately after seeing search results. High exit rate means the results page itself is a problem, either irrelevant results or poor UX.
  • Revenue per search: Total revenue from search sessions divided by total searches. Connects search quality directly to business outcomes.
  • Query refinement rate: How often shoppers modify their query. High refinement rates signal unsatisfying initial results.
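
Several of these metrics fall out of a single pass over the search event log. The sketch below assumes a simple event shape (one dictionary per search, with hypothetical field names) in which purchase revenue has already been attributed to the search that preceded it in the session.

```python
def search_metrics(events):
    """Aggregate core search metrics from a list of per-search event dicts."""
    total = len(events)
    if total == 0:
        return {}
    zero_results = sum(1 for e in events if e["results_returned"] == 0)
    clicked = sum(1 for e in events if e["clicked"])
    converted = sum(1 for e in events if e["revenue"] > 0)
    revenue = sum(e["revenue"] for e in events)
    return {
        "zero_result_rate": zero_results / total,
        "click_rate": clicked / total,  # any click, not first-result CTR
        "search_conversion_rate": converted / total,
        "revenue_per_search": revenue / total,
    }


# Hypothetical events; field names are assumptions for this example.
EVENTS = [
    {"query": "running shoes", "results_returned": 24, "clicked": True, "revenue": 80.0},
    {"query": "running shoes", "results_returned": 24, "clicked": True, "revenue": 0.0},
    {"query": "couch", "results_returned": 0, "clicked": False, "revenue": 0.0},
    {"query": "desk lamp", "results_returned": 9, "clicked": False, "revenue": 0.0},
]

print(search_metrics(EVENTS))
```

First-result CTR and query refinement rate need extra fields (click position and session ordering), so they are left out of this sketch.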

Using analytics to improve search quality

The most actionable report is the "top zero-result queries" list. These are products or categories your customers want that your search cannot find. Some represent catalog gaps (you do not carry the product). Others represent vocabulary mismatches (synonyms you have not mapped). Review this list weekly and you will catch issues that directly impact revenue.

The second most valuable report is "high-search-volume, low-conversion queries." These are queries where shoppers search frequently but rarely buy. The products exist in your catalog, but the ranking is off. Fixing the ranking for just the top 20 of these queries often produces a measurable revenue lift because they represent concentrated search volume.
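
Both reports reduce to small aggregations over logged search events. The sketch below assumes each event records the query, the number of results returned, and whether the search converted. The field names, the minimum-volume cutoff, and the 5% conversion ceiling are illustrative choices rather than recommendations.

```python
from collections import Counter, defaultdict


def top_zero_result_queries(events, n=10):
    """Most frequent queries that returned no products."""
    counts = Counter(e["query"] for e in events if e["results_returned"] == 0)
    return counts.most_common(n)


def high_volume_low_conversion(events, min_searches=3, max_conversion=0.05):
    """Queries with results and enough volume, but a conversion rate at or below the ceiling."""
    stats = defaultdict(lambda: [0, 0])  # query -> [searches, conversions]
    for e in events:
        if e["results_returned"] == 0:
            continue  # zero-result queries belong to the other report
        stats[e["query"]][0] += 1
        if e["converted"]:
            stats[e["query"]][1] += 1
    flagged = [
        (query, searches, conversions / searches)
        for query, (searches, conversions) in stats.items()
        if searches >= min_searches and conversions / searches <= max_conversion
    ]
    return sorted(flagged, key=lambda row: row[1], reverse=True)


def _event(query, results_returned, converted=False):
    return {"query": query, "results_returned": results_returned, "converted": converted}


# Hypothetical log sample.
EVENTS = (
    [_event("couch", 0)] * 3
    + [_event("sofa bed", 0)]
    + [_event("standing desk", 12)] * 4
    + [_event("usb-c cable", 8, True)] * 2
    + [_event("usb-c cable", 8)]
)

print(top_zero_result_queries(EVENTS, n=2))  # [('couch', 3), ('sofa bed', 1)]
print(high_volume_low_conversion(EVENTS))    # [('standing desk', 4, 0.0)]
```

In a warehouse-backed stack the same logic is usually written as two scheduled SQL queries, but the grouping and thresholds are the same.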

Tools like Algolia Analytics, Searchspring, and custom dashboards built on Amplitude or Mixpanel handle search analytics well. If you are running a custom stack, log every search event (query, results returned, results clicked, conversion outcome) to a data warehouse and build reports in your BI tool.

Choosing Your Stack: Tools and Architecture for Production

The right tooling depends on your catalog size, query volume, team capabilities, and budget. Here is a breakdown of the most common approaches, from fully managed to fully custom.

Algolia NeuralSearch

Algolia added semantic search (NeuralSearch) on top of its keyword platform. It handles hybrid retrieval, typo tolerance, synonyms, and analytics in a single managed service. Pricing starts at $1 per 1,000 requests. Best for teams that want fast time-to-value without managing infrastructure. Works well for catalogs up to 1M products. The tradeoff: limited ranking customization and higher costs at scale ($5,000 to $15,000/month for high-traffic stores).

Typesense

Open-source search engine with built-in vector search, sub-10ms query latency, and native typo tolerance and faceting. Self-hosting on a 3-node cluster costs $300 to $800/month. Typesense Cloud starts at $60/month. Best for teams that want managed-engine speed with open-source flexibility.

Vespa

Originally built by Yahoo for web-scale search, Vespa handles hybrid search, machine-learned ranking, and real-time indexing in a single platform. It is the most powerful option here but also the most complex. Vespa excels with large catalogs (millions of products), complex ranking models, and teams with search engineering expertise. Self-hosting costs $2,000 to $8,000/month. Vespa Cloud offers a managed option.

Custom with pgvector + Elasticsearch

If you already run PostgreSQL and Elasticsearch, you can build hybrid search without new infrastructure. Use Elasticsearch for BM25 keyword search and pgvector for vector similarity. Write a thin fusion layer that merges results using Reciprocal Rank Fusion. This is the cheapest to operate (existing infrastructure plus embedding API costs of $50 to $200/month) but requires more engineering effort. Best for teams with strong backend engineers and existing search infrastructure.

Recommended architecture for most ecommerce platforms

  • Query understanding layer: lightweight LLM (GPT-4o-mini or Claude 3.5 Haiku) for intent detection and attribute extraction, with Redis caching for repeated queries
  • Retrieval layer: Typesense or Qdrant for hybrid keyword + vector search, with product embeddings generated by Cohere embed-v3 or OpenAI text-embedding-3-small
  • Reranking layer: cross-encoder model (Cohere Rerank or a fine-tuned BERT cross-encoder) to rescore the top 100 results
  • Personalization layer: XGBoost learning-to-rank model with user behavioral features, served via a lightweight microservice
  • Analytics layer: event logging to BigQuery or Snowflake, dashboards in Looker or Metabase, weekly automated reports on zero-result and low-conversion queries

Total infrastructure cost for a mid-market ecommerce platform (100K products, 500K monthly searches): $500 to $2,000/month. Implementation timeline: 6 to 10 weeks for a team of 2 to 3 engineers.

Ready to Build AI-Powered Search for Your Store?

Building production-grade AI search requires getting the architecture, tooling, and ranking signals right from the start. At Kanopy, we have built search systems for ecommerce platforms ranging from 10,000 to 5 million products. Book a free strategy call and we will map out the fastest path from your current search to an AI-powered system that converts.
