
How to Build a Virtual Try-On App with AR for Retail in 2026

AR try-on lifts conversion 80-90% for apparel and beauty. Here's how to build a virtual try-on app that retail DTC brands can actually deploy.

Nate Laquis

Founder & CEO

Try-On ROI: Why Retail Is Racing to Ship AR in 2026

Virtual try-on has moved from novelty demo to measurable revenue driver. The brands investing most aggressively in AR are not the ones chasing buzz. They are the ones watching their conversion dashboards. Warby Parker attributes a meaningful share of its direct-to-consumer growth to its virtual try-on feature. Sephora's Virtual Artist has processed more than 200 million shade tries since launch. IKEA Place helped normalize furniture placement AR across an entire category. The numbers behind these deployments tell a consistent story.

Across apparel, beauty, eyewear, and home goods, AR try-on typically lifts add-to-cart rates 80 to 90 percent and reduces return rates 25 to 40 percent. For categories like apparel, where returns cost 20 to 30 percent of revenue, the return-rate reduction alone can pay for the entire AR investment in a single quarter; the conversion lift is almost a bonus on top. When shoppers can see how a product looks on their face, body, or in their living room, they buy with confidence and they keep what they buy.

Shopper using AR try-on in retail app

The cost curve has also collapsed. Building a try-on experience in 2019 required a custom computer vision team, proprietary rendering engines, and multi-million dollar budgets. In 2026, ARKit, ARCore, and MediaPipe handle the hard tracking problems natively on device. Shopify and Salesforce Commerce Cloud expose product data through clean APIs. Reality Composer Pro and USDZ asset pipelines let merchandising teams update 3D catalogs without engineering support. A focused try-on app for a single category now ships in 12 to 16 weeks at budgets that mid-market brands can justify without board approval.

This guide walks through how to actually build one. We will cover category selection, the face and body tracking pipelines, 3D asset production, commerce integration, performance targets, and how to measure whether your investment is working. If you are also evaluating broader commerce architecture, pair this with our guide on how to build an ecommerce app to understand where AR fits inside the larger buying funnel.

Product Category: Pick the Right Try-On Problem

Not all products benefit equally from AR try-on, and the engineering challenges vary dramatically by category. Before committing to a roadmap, decide which category you are solving for. The four dominant verticals each have different tracking requirements, asset pipelines, and expected conversion outcomes.

Eyewear is the easiest category to ship well. Face tracking via ARKit or MediaPipe Face Mesh delivers sub-millimeter landmark accuracy, and glasses are rigid objects that occlude predictably around the nose and ears. A competent team can ship a Warby Parker-class eyewear try-on in eight to ten weeks. Expected conversion lift is 90 to 120 percent for shoppers who engage with the feature.

Beauty and makeup is the highest volume category. Lipstick, foundation, eyeshadow, and blush all rely on face segmentation plus real-time shader compositing rather than 3D geometry. L'Oreal ModiFace pioneered the stack, and Sephora Virtual Artist extended it to thousands of SKUs. The technical stack is face landmarks plus region masks plus GPU color math. Banuba and DeepAR both sell white-label SDKs that will get you to 80 percent of a custom build at 10 percent of the cost.

Apparel is the hardest category and also the biggest prize. Body tracking, garment physics, and size accuracy all have to work together. ARKit Body Tracking and MediaPipe Pose give you skeleton data. CLO 3D handles garment draping during asset creation. Real-time cloth simulation on device is still compromise territory. Most apparel try-on in 2026 uses still-image virtual dressing rather than live video, trading interactivity for visual fidelity.

AR try-on development workflow for retail apps

Furniture and home goods uses world tracking rather than body or face tracking. IKEA Place set the standard: detect a plane, place a scaled USDZ model, let the shopper walk around it. Apple's RoomPlan framework, available since iOS 16, generates full room geometry, which enables smarter placement and occlusion. The asset pipeline is the long pole. A sofa at 4K texture resolution with proper PBR materials can take a 3D artist two to three days per SKU, which is why most furniture retailers cap their AR catalog at bestsellers rather than trying to cover every SKU.

Face Tracking and the Makeup Pipeline

For beauty and eyewear, the core technical primitive is face tracking. ARKit's ARFaceAnchor returns 1220 vertices of face geometry at 60 frames per second on any iPhone with the TrueDepth camera, and Apple Vision Framework's VNDetectFaceLandmarksRequest covers devices without TrueDepth at reduced fidelity. On Android, ARCore Augmented Faces gives you 468 landmarks via MediaPipe Face Mesh. These APIs are mature, well-documented, and production-ready.

The makeup rendering pipeline stacks four layers. First, the face mesh establishes 3D topology. Second, region masks identify lips, eyelids, cheeks, and eyebrows using the landmark indices. Third, shaders composite the product's color, texture, and finish (matte, satin, shimmer, metallic) onto the underlying skin using blend modes that preserve skin tone and lighting. Fourth, the output is rendered back into the camera feed with temporal smoothing to prevent flicker during head movement.
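As a rough illustration of layers two and three, here is a minimal Python sketch of the per-pixel compositing math, with a luminance-preserving blend standing in for a production GPU shader. The function name, the tint formula, and the constants are all illustrative, not any vendor's actual shader:

```python
def composite_shade(skin, product, mask_alpha, opacity=0.85):
    """Blend a product color over skin inside a region mask.

    skin / product: (r, g, b) floats in 0..1; mask_alpha in 0..1
    is the region-mask weight for this pixel (e.g. lips).
    """
    # Skin luminance (Rec. 709 weights) carries real lighting and
    # shading through the color layer instead of painting flat color.
    luma = 0.2126 * skin[0] + 0.7152 * skin[1] + 0.0722 * skin[2]
    a = mask_alpha * opacity
    # Tint the product shade by the skin's luminance, then alpha-blend.
    tinted = tuple(min(1.0, c * (0.35 + 0.65 * luma)) for c in product)
    return tuple((1 - a) * s + a * t for s, t in zip(skin, tinted))
```

Because the mask weight feathers toward zero at region edges, the same function also produces the soft falloff that keeps lip edges from looking painted on.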

Color accuracy is where most teams fail. Lipstick that reads as the right shade under office fluorescent lighting may look completely different under warm window light. The solution is automatic white balancing informed by the face mesh: sample neutral skin regions, estimate the illuminant color, then apply inverse correction before compositing the product. Sephora's team published a paper on this approach that is worth reading before you ship.
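A minimal sketch of that correction, assuming a simple gray-world estimate restricted to mesh-located neutral skin patches. The function names and green-channel normalization are our illustrative choices, not Sephora's published method:

```python
def estimate_illuminant(neutral_samples):
    """Average RGB of mesh-sampled neutral skin patches, normalized
    so green = 1 -- a gray-world estimate on known-neutral regions."""
    n = len(neutral_samples)
    avg = [sum(s[i] for s in neutral_samples) / n for i in range(3)]
    g = avg[1] if avg[1] > 0 else 1e-6
    return (avg[0] / g, 1.0, avg[2] / g)


def correct_color(rgb, illuminant):
    """Divide out the estimated color cast before compositing product."""
    return tuple(min(1.0, c / max(i, 1e-6)) for c, i in zip(rgb, illuminant))
```

Under warm light the estimated illuminant has red above 1 and blue below 1, so the inverse correction pulls a warm-biased frame back toward neutral before the product color is composited.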

For texture and finish, glossy and metallic products require physically based shading with environment reflections. Apple's RealityKit and the newer MetalFX upscaler handle this efficiently, but you still need properly authored shader graphs per product finish. Budget one graphics engineer full time for two months to build a reusable shader library that your merchandising team can configure per SKU.

If you do not want to build from scratch, Banuba, DeepAR, and Snap Camera Kit all offer commercial SDKs with pre-built makeup effects. Licensing runs 2 to 5 cents per active user per month at scale, which is cheaper than building and maintaining a custom pipeline unless you are at tens of millions of users.

Body Tracking and Garment Simulation

Apparel is where AR try-on transitions from solved problem to active research. The full pipeline has four stages: pose detection, body measurement estimation, garment draping, and occlusion handling. Each stage has mature pieces and weak links.

Pose detection is solid. ARKit Body Tracking on iPhone Pro models returns a 91-joint skeleton at 60 frames per second. MediaPipe Pose works on any smartphone with reduced joint count but acceptable accuracy. Both give you enough skeletal data to attach garment geometry.

Body measurement is the hard part. Customers come in every shape, and a single skeleton tells you nothing about circumference. Leading apps ask the shopper to step back 6 to 8 feet and rotate 360 degrees, which lets the app reconstruct a parameterized body model (typically SMPL or SMPL-X from the Max Planck Institute). The reconstruction infers measurements accurate to within 2 centimeters, good enough for size recommendations but not tailored fitting.
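The size-recommendation step that sits on top of the reconstruction can be sketched as a lookup against a size chart widened by the measurement error band. The chart values here are hypothetical:

```python
# Hypothetical brand size chart: chest circumference ranges in cm.
SIZE_CHART = {"S": (86, 92), "M": (92, 98), "L": (98, 104), "XL": (104, 110)}


def recommend_sizes(chest_cm, error_cm=2.0):
    """Sizes whose range overlaps the estimate's +/- error band.

    One hit means a confident recommendation; two hits mean the shopper
    is between sizes, which is worth surfacing explicitly in the UI.
    """
    return [size for size, (lo, hi) in SIZE_CHART.items()
            if lo - error_cm <= chest_cm < hi + error_cm]
```

A shopper measured at 95 cm lands squarely in one size, while 97 cm straddles two, which matches how the 2 cm reconstruction error should be communicated rather than hidden.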

Body tracking and garment simulation in retail AR

Garment draping happens primarily offline in the asset pipeline using CLO 3D or Marvelous Designer. These tools simulate cloth physics against a base body model, producing a family of garment meshes that the app can interpolate based on the shopper's reconstructed body shape. Real-time cloth simulation on device is possible using position-based dynamics, but it costs 3 to 5 milliseconds per frame and most apps skip it in favor of pre-baked animation.
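The runtime interpolation step is conceptually simple. A sketch, assuming two pre-draped garment meshes with identical vertex topology exported from the offline simulation:

```python
def blend_garment(mesh_slim, mesh_full, t):
    """Linearly interpolate two pre-draped garment meshes by a
    body-shape parameter t in [0, 1].

    Each mesh is a list of (x, y, z) vertex tuples; both meshes must
    share topology, which the offline CLO 3D export guarantees.
    """
    return [tuple(a + t * (b - a) for a, b in zip(va, vb))
            for va, vb in zip(mesh_slim, mesh_full)]
```

In practice apps interpolate across several body-shape axes (chest, waist, height), but each axis reduces to exactly this per-vertex blend.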

Occlusion is where apparel AR still looks fake. When the shopper's arm crosses in front of their torso, the garment needs to render behind the arm. ARKit's people occlusion handles this via depth matting on devices with LiDAR, but quality degrades sharply on older hardware. For now, most production apparel try-on apps constrain the shopper to a front-facing pose and skip dynamic occlusion. The resulting experience feels more like a still dressing room than live video, which is an acceptable tradeoff for most shoppers.

The 3D Asset Pipeline: Photogrammetry vs Manual Modeling

Your AR app is only as good as its 3D catalog, and building that catalog is the expensive and unglamorous part of every try-on project. Budget more for assets than for code. The two primary production paths are photogrammetry and manual modeling, and most mature programs use both.

Photogrammetry captures an object by photographing it from 60 to 200 angles, then reconstructing geometry and textures algorithmically. Apple's Object Capture API runs the full reconstruction pipeline on a Mac in 10 to 20 minutes per object, and since iOS 17 it can capture and reconstruct entirely on an iPhone. The output is a USDZ file ready for Reality Composer Pro or Blender finishing. For rigid products with complex textures (handbags, shoes, bottles, jewelry), photogrammetry beats manual modeling on both cost and realism. Expect $50 to $150 per finished SKU at scale.

Manual modeling in Blender or Maya is required for products that photogrammetry handles poorly: reflective surfaces, transparent materials, fabrics with fine weave, and anything that deforms like clothing or upholstery. A competent 3D artist produces two to four finished SKUs per day depending on complexity. Rates run $40 to $90 per hour for contract work, landing most manual SKUs in the $200 to $600 range.

For apparel specifically, CLO 3D is the industry standard. Designers import their actual 2D patterns and sew them digitally onto a parametric mannequin, producing garment geometry that matches the physical product's fit. Many apparel brands already use CLO 3D for product development, which means the 3D assets needed for AR try-on are a byproduct of work that is already happening. Aligning the AR pipeline to the existing design workflow is the single biggest cost saver we see in apparel programs.

Regardless of production method, invest early in a 3D asset management system. Plug it into your PIM so that 3D files live alongside product data, version together, and flow through the same approval workflows as photography. Retailers that bolt AR assets onto ad-hoc Dropbox folders never get past a pilot.

Shopify, Magento, and Commerce Cloud Integration

An AR try-on app that does not connect cleanly to your commerce backend is a demo, not a product. The shopper needs to see live inventory, apply their loyalty discount, add the item to the same cart they use on the web and the main app, and check out with their saved payment method. Commerce integration is where most AR projects underestimate the work.

Ecommerce platform integration for AR try-on apps

For Shopify brands, the Storefront API exposes everything you need: products, variants, inventory, pricing, cart mutations, and checkout URLs. Build your AR app against a dedicated Storefront API token and treat it as a first-class sales channel. The Storefront API's cart mutations let you hand off a populated cart to the web checkout, which preserves Shop Pay and every payment method the customer already has saved. Resist the temptation to build a native checkout in the AR app: shoppers convert better on Shop Pay than on anything you will build.
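A sketch of what that cart handoff looks like against the Storefront API's `cartCreate` mutation. Authentication headers, the endpoint URL, and error handling are omitted, and the helper name is ours; the mutation shape itself is from the Storefront API:

```python
import json

CART_CREATE = """
mutation CartCreate($lines: [CartLineInput!]!) {
  cartCreate(input: { lines: $lines }) {
    cart { id checkoutUrl }
  }
}
"""


def build_cart_request(variant_gid, quantity=1):
    """JSON body for a Storefront API cartCreate call.

    POST this to the shop's GraphQL endpoint; the returned checkoutUrl
    hands the shopper to web checkout with Shop Pay intact.
    """
    return json.dumps({
        "query": CART_CREATE,
        "variables": {
            "lines": [{"merchandiseId": variant_gid, "quantity": quantity}]
        },
    })
```

Opening `checkoutUrl` in an in-app browser sheet is the whole handoff; the AR app never touches payment data.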

For Magento merchants, GraphQL endpoints cover similar ground but with more configuration surface. Magento's product attribute flexibility is both an asset and a liability: your AR app needs to know which attributes drive variant selection (size, color, shade) and which are purely informational. Build a configuration layer in your PIM or a middleware service rather than hardcoding attribute names into the client app.

Salesforce Commerce Cloud (formerly Demandware) is a mainstay among enterprise retailers. The Shopper API (OCAPI or the newer SCAPI) covers products and cart, but cart handoff to the web is clunkier than on Shopify. Most Commerce Cloud deployments rebuild native checkout in the AR app rather than fight the handoff. Budget an extra sprint for this.

Regardless of platform, the integration pattern that works best is a thin backend-for-frontend (BFF) service between your AR app and the commerce platform. The BFF handles authentication, caches product data, formats responses for the client, and isolates the AR app from commerce platform quirks. When you migrate platforms (and every retailer does eventually), you only rewrite the BFF, not the client. For a deeper look at commerce backend architecture, our guide on AI for ecommerce covers how product and customer data flow into personalization systems that also apply here.
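A minimal Python sketch of the BFF pattern with caching and response shaping. The field names, TTL, and fetcher interface are our assumptions for illustration, not any platform's actual schema:

```python
import time


class ProductBFF:
    """Thin backend-for-frontend: caches platform product data and
    reshapes it into the only fields the AR client needs."""

    def __init__(self, fetch_product, ttl_seconds=300):
        self._fetch = fetch_product      # platform-specific fetcher
        self._ttl = ttl_seconds
        self._cache = {}                 # sku -> (expires_at, payload)

    def get(self, sku):
        now = time.monotonic()
        hit = self._cache.get(sku)
        if hit and hit[0] > now:
            return hit[1]                # fresh cache hit
        raw = self._fetch(sku)           # e.g. Storefront / OCAPI call
        payload = {                      # client-facing shape only
            "sku": sku,
            "price": raw["price"],
            "asset_url": raw.get("usdz_url"),  # 3D asset from the PIM
            "in_stock": raw["inventory"] > 0,
        }
        self._cache[sku] = (now + self._ttl, payload)
        return payload
```

Swapping commerce platforms means rewriting `fetch_product` and the payload mapping; the AR client's contract never changes.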

Performance Targets and Device Compatibility

AR apps live or die on performance. A try-on experience that stutters, drains battery, or fails on older phones will not convert shoppers. The performance budget for a retail AR app is tighter than most teams initially assume.

Frame rate is the primary metric. Face tracking must hit 60 frames per second on every supported device. Body tracking can drop to 30 in the worst case, but 60 is strongly preferred. Anything below 30 feels broken, and shoppers close the app. Room-scale AR can tolerate 30 fps because the camera moves more slowly than a face or body.

Thermal performance is the hidden killer. iOS and Android both aggressively throttle when the device hits 40 degrees Celsius, and AR apps heat devices fast because they run the camera, neural engine, GPU, and network radio simultaneously. A try-on experience that works beautifully for the first 90 seconds and then degrades into stutter is worse than one that starts at 30 fps and holds. Budget for thermal profiling on real devices starting in week four, not week twelve.
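One common mitigation is a quality governor that steps rendering tiers down before the OS starts throttling. A sketch, with state names mirroring iOS's ProcessInfo thermal levels (nominal / fair / serious / critical); the tier names and the 45 fps threshold are illustrative:

```python
def pick_quality(thermal_state, recent_fps):
    """Choose a rendering tier from the OS thermal state and a rolling
    measured frame rate, degrading gracefully instead of stuttering."""
    if thermal_state in ("serious", "critical"):
        return "minimal"   # drop render scale, shadows, reflections
    if thermal_state == "fair" or recent_fps < 45:
        return "reduced"   # lower render scale, keep tracking quality
    return "full"
```

Called once per second from the render loop, a governor like this turns the "beautiful for 90 seconds, then stutter" failure mode into a steady, slightly softer experience that shoppers do not notice.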

Device compatibility in 2026 means iOS 17 and later for full feature support and iOS 15 and later for degraded fallback. On iOS, the split between devices with LiDAR (iPhone Pro models) and without matters for occlusion and room scanning. On Android, ARCore officially supports about 900 device models but real-world stability varies. Most retailers target iPhone first, Pixel and Samsung flagship second, and accept that the bottom 20 percent of Android hardware will get a reduced experience or no AR at all.

App size matters more than developers remember. A try-on app with a 200 SKU 3D catalog can easily balloon to 500 MB or more, which hurts install rates. Ship the app shell at under 100 MB and download 3D assets on demand, ideally over HTTP with aggressive caching. USDZ files compress well, and most assets land in the 2 to 8 MB range per SKU. Apple's on-demand resources API and Android's Play Asset Delivery both handle this pattern cleanly.
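On-demand download usually pairs with a bounded local cache so the on-device footprint stays predictable. A sketch of an LRU size budget in Python; the budget and per-asset sizes are illustrative:

```python
from collections import OrderedDict


class AssetCache:
    """LRU size budget for downloaded per-SKU 3D assets, so the app
    shell ships small and storage stays bounded as shoppers browse."""

    def __init__(self, budget_mb=150.0):
        self._budget = budget_mb
        self._items = OrderedDict()   # sku -> size_mb, in LRU order
        self.used_mb = 0.0

    def add(self, sku, size_mb):
        if sku in self._items:
            self._items.move_to_end(sku)   # refresh recency
            return
        # Evict least-recently-used assets until the new one fits.
        while self._items and self.used_mb + size_mb > self._budget:
            _, evicted = self._items.popitem(last=False)
            self.used_mb -= evicted
        self._items[sku] = size_mb
        self.used_mb += size_mb

    def __contains__(self, sku):
        return sku in self._items
```

The same bookkeeping sits naturally on top of Apple's on-demand resources or Play Asset Delivery, which handle the transfer itself.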

For the full cost breakdown of building and shipping production AR, our AR/VR app development cost guide covers budget ranges by scope and complexity.

Measuring Conversion Lift and Tying AR to Revenue

An AR feature that nobody measures is an AR feature that will be cut in the next budget cycle. Instrument conversion lift from day one, and tie every try-on session to downstream purchase data so that finance can see the return.

The core metric is AR-assisted conversion rate versus baseline. Segment shoppers into three buckets: saw the AR button but did not tap, tapped AR and engaged for at least 10 seconds, and tapped AR and added to cart from within the AR experience. The gap between the first and third bucket is your AR attribution window. In well-instrumented programs, that window shows 2 to 3x higher conversion for AR engagers, which translates to 20 to 40 percent incremental revenue on the products with AR coverage.
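The three-bucket segmentation is simple to instrument. A sketch, assuming an illustrative session event schema (the field names are ours):

```python
def ar_conversion_by_bucket(sessions):
    """Conversion rate per engagement bucket.

    Each session is a dict with 'ar_engaged_seconds',
    'ar_added_to_cart', and 'purchased' -- an illustrative schema.
    """
    buckets = {"no_ar": [], "ar_engaged": [], "ar_cart": []}
    for s in sessions:
        if s.get("ar_added_to_cart"):
            buckets["ar_cart"].append(s)
        elif s.get("ar_engaged_seconds", 0) >= 10:
            buckets["ar_engaged"].append(s)
        else:
            buckets["no_ar"].append(s)
    return {name: (sum(1 for s in group if s["purchased"]) / len(group)
                   if group else 0.0)
            for name, group in buckets.items()}
```

Run this over a day of sessions and the gap between the `no_ar` and `ar_cart` rates is the attribution window finance will ask about.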

Return rate is the second key metric and often the more financially significant one. Tag every order placed after an AR session with a flag, and compare return rates 30, 60, and 90 days out against baseline. Expect 25 to 40 percent lower returns for AR orders in apparel and furniture, and 10 to 20 percent lower in beauty and eyewear.

Engagement telemetry matters for product decisions. Track session length, number of products tried per session, shade or variant switches, and share events. Shoppers who try on three or more products in a session convert at 3 to 5x the rate of single-product sessions, which tells your merchandising team to surface more recommendations inside the AR experience.

Tie everything back to revenue through your existing analytics stack. Segment and mParticle both handle AR event schemas well. Push engagement events into your data warehouse alongside order data so that lifetime value analysis can include AR as a cohort dimension. Shoppers acquired through AR-heavy campaigns often show 15 to 25 percent higher LTV than baseline, which justifies spending more on paid acquisition channels that lead with AR demos.

Once the instrumentation is in place, you can run A/B tests on everything: AR button placement, first-try product selection, tutorial flow, cart handoff timing. Expect the first six months after launch to deliver as much lift from optimization as the initial launch itself. AR try-on is not a set-it-and-forget-it feature. It is a compounding surface that keeps paying off as long as you keep tuning it.

If you are ready to ship an AR try-on experience that actually moves conversion, return rates, and LTV, we would love to talk through your category, your commerce stack, and your timeline. Book a free strategy call and we will map the fastest path to a production pilot your finance team will defend.


virtual try-on app development, AR try-on build, ARKit face tracking, apparel AR app, beauty virtual try-on
