The Core Problem Both Architectures Solve
Two users open the same document. Both start typing at the same time. User A inserts "Hello" at position 5 while User B deletes characters 3 through 7. Without a conflict resolution strategy, the document diverges and corrupts. Every real-time collaboration system needs a deterministic answer to this question: when two people edit the same thing simultaneously, what should the final state be?
Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDTs) are the two dominant solutions. They have been competing for mindshare since the late 2000s, and each has won major production deployments. Google Docs runs on OT. Figma's multiplayer system is built on CRDT-inspired ideas. Both work, but the engineering tradeoffs are dramatically different depending on your use case, your team size, your offline requirements, and your performance constraints.
This is not a theoretical comparison. We have built collaborative features using both approaches, and the choice affects everything from your server architecture to your document storage format to your sync protocol. If you pick the wrong one, you will either over-engineer a simple problem or paint yourself into a corner when you need to scale.
For context on the real-time transport layer underneath these algorithms, see our guide to building real-time features.
Operational Transformation: How It Works
OT was introduced by Ellis and Gibbs in 1989 for the GROVE group editor at MCC, and refined by the Jupiter collaboration system at Xerox PARC in 1995. The core idea is elegant: when you receive a remote operation, you transform it against any local operations that have already been applied so that the remote operation "fits" the current document state.
Here is a concrete example, using 1-based positions. The document contains "ABCDE". User A inserts "X" at position 2, producing "AXBCDE". Meanwhile, User B deletes the character at position 4 (the "D"), producing "ABCE". When User A receives User B's delete-at-4 operation, OT transforms it: since User A already inserted a character before position 4, the deletion index shifts to 5. The transformed operation deletes the correct character. On User B's side, the incoming insert at position 2 lands before the deleted character, so it applies unchanged. Both users converge on "AXBCE".
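The index shift in this example can be reproduced in a few lines. This is a minimal sketch, not a library API: the `Insert` and `Delete` types and the helper names are ours, and positions are 1-based to match the example.

```typescript
// Positions are 1-based to match the worked example in the text.
type Insert = { kind: "insert"; pos: number; text: string };
type Delete = { kind: "delete"; pos: number };

// Adjust a remote delete so it still targets the same character after a
// concurrent insert has already been applied locally.
function transformDeleteAgainstInsert(del: Delete, ins: Insert): Delete {
  // An insert at or before the target character pushes that character right.
  return ins.pos <= del.pos
    ? { kind: "delete", pos: del.pos + ins.text.length }
    : del;
}

function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.pos - 1) + op.text + doc.slice(op.pos - 1);
}

function applyDelete(doc: string, op: Delete): string {
  return doc.slice(0, op.pos - 1) + doc.slice(op.pos);
}

const insertX: Insert = { kind: "insert", pos: 2, text: "X" };
const deleteD: Delete = { kind: "delete", pos: 4 };

// User A: local insert first, then the transformed remote delete (now at 5).
const userA = applyDelete(
  applyInsert("ABCDE", insertX),
  transformDeleteAgainstInsert(deleteD, insertX)
);

// User B: local delete first. The insert sits before the deleted character,
// so it applies without adjustment.
const userB = applyInsert(applyDelete("ABCDE", deleteD), insertX);
```

Both `userA` and `userB` end up holding the same string, which is the convergence property the rest of this article keeps returning to.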
The Transform Function
The transform function takes two concurrent operations and returns adjusted versions of each that can be applied in either order and produce the same result. For text editing, you need transforms for insert-vs-insert, insert-vs-delete, delete-vs-insert, and delete-vs-delete. Each combination has edge cases around cursor positions, tie-breaking (what happens when two inserts target the same position), and range operations like formatting spans.
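To make the four combinations concrete, here is a minimal sketch for single-character operations with 0-based positions. The `Op` shape, the numeric `site` used for tie-breaking, and the `noop` result for two deletes of the same character are our illustrative choices, not any particular library's design.

```typescript
type Op =
  | { kind: "insert"; pos: number; ch: string; site: number }
  | { kind: "delete"; pos: number }
  | { kind: "noop" };

// Rewrite `op` so it can be applied after the concurrent operation `other`.
function transform(op: Op, other: Op): Op {
  if (op.kind === "noop" || other.kind === "noop") return op;
  if (op.kind === "insert" && other.kind === "insert") {
    // Tie-break equal positions by site ID so both replicas pick the same order.
    const otherFirst =
      other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
    return otherFirst ? { ...op, pos: op.pos + 1 } : op;
  }
  if (op.kind === "insert" && other.kind === "delete") {
    return other.pos < op.pos ? { ...op, pos: op.pos - 1 } : op;
  }
  if (op.kind === "delete" && other.kind === "insert") {
    return other.pos <= op.pos ? { ...op, pos: op.pos + 1 } : op;
  }
  if (op.kind === "delete" && other.kind === "delete") {
    // Both sides removed the same character: nothing is left to do.
    if (other.pos === op.pos) return { kind: "noop" };
    return other.pos < op.pos ? { ...op, pos: op.pos - 1 } : op;
  }
  return op;
}

function apply(doc: string, op: Op): string {
  switch (op.kind) {
    case "insert":
      return doc.slice(0, op.pos) + op.ch + doc.slice(op.pos);
    case "delete":
      return doc.slice(0, op.pos) + doc.slice(op.pos + 1);
    case "noop":
      return doc;
  }
}

// The property every transform pair must satisfy: applying a then b', or
// b then a', yields the same document.
function converges(doc: string, a: Op, b: Op): boolean {
  const viaA = apply(apply(doc, a), transform(b, a));
  const viaB = apply(apply(doc, b), transform(a, b));
  return viaA === viaB;
}
```

A check like `converges` is worth running against randomized operation pairs in a test suite, because the edge cases (equal positions, adjacent deletes) are exactly where hand-written transforms tend to go wrong. Range operations and formatting spans need considerably more cases than this sketch covers.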
The Server Requirement
Classic OT requires a central server to establish a total ordering of operations. The server acts as the single source of truth for operation sequencing. Google Docs uses this model: every operation goes to Google's server, gets assigned a sequence number, gets transformed against any concurrent operations, and gets broadcast to all other clients. This is why live collaboration in Google Docs depends on a connection to Google's servers.
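The sequencing role can be sketched as a small class. This is an illustration under simplifying assumptions: operations are inserts only, `SequencingServer` and `baseRevision` are names we made up, and broadcasting to clients is left out.

```typescript
type InsertOp = { pos: number; text: string; client: string };

// Shift `op` to account for `applied`, which the server sequenced first.
function transformInsert(op: InsertOp, applied: InsertOp): InsertOp {
  const appliedFirst =
    applied.pos < op.pos ||
    (applied.pos === op.pos && applied.client < op.client);
  return appliedFirst ? { ...op, pos: op.pos + applied.text.length } : op;
}

class SequencingServer {
  // log[i] is the operation that produced revision i + 1.
  private log: InsertOp[] = [];

  constructor(private doc: string) {}

  // `baseRevision` is the last revision the client had seen when it
  // created `op`. Everything logged after that is concurrent with `op`.
  receive(op: InsertOp, baseRevision: number): { revision: number; op: InsertOp } {
    let transformed = op;
    for (const earlier of this.log.slice(baseRevision)) {
      transformed = transformInsert(transformed, earlier);
    }
    this.doc =
      this.doc.slice(0, transformed.pos) +
      transformed.text +
      this.doc.slice(transformed.pos);
    this.log.push(transformed);
    // The caller would broadcast this result to every other client.
    return { revision: this.log.length, op: transformed };
  }

  get text(): string {
    return this.doc;
  }
}
```

If two clients both send an insert based on revision 0, the second one to arrive is transformed against the first before it is applied and logged. The revision number returned by `receive` is the total order that every client agrees on.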
ShareDB and Practical Implementation
ShareDB is the most widely used open-source OT library for JavaScript. It provides a server component (Node.js) that manages document state and broadcasts transformed operations to connected clients. ShareDB uses JSON OT (json0 and json1 types) that support text editing, object manipulation, and list operations. A basic ShareDB setup takes about a day to get running, but production hardening (reconnection, error handling, persistence) adds another 1 to 2 weeks.
The key limitation of ShareDB and OT in general: the transform function complexity grows quadratically with the number of operation types. If you have 5 operation types, you need 25 transform functions (every type against every other type). Adding rich text formatting, tables, embedded objects, and comments to a collaborative editor can push you past 15 operation types, requiring 225+ transform functions, each with subtle correctness requirements.
CRDTs: How They Work
CRDTs take a mathematically different approach. Instead of transforming operations after the fact, CRDTs design data structures so that concurrent operations can be merged in any order and always converge to the same state. There is no need for a central server to sequence operations, because the merge function is commutative, associative, and idempotent by construction.
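The three merge properties are easiest to see on the simplest CRDT there is, a grow-only counter. The sketch below is a textbook G-Counter rather than anything from Yjs or Automerge, and the replica names are placeholders.

```typescript
// Each replica only ever increments its own slot.
type GCounter = Record<string, number>;

function increment(c: GCounter, replica: string, by: number = 1): GCounter {
  return { ...c, [replica]: (c[replica] ?? 0) + by };
}

// Merge takes the per-replica maximum. Max is commutative, associative,
// and idempotent, so the merge inherits all three properties.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const replica of Object.keys(b)) {
    out[replica] = Math.max(out[replica] ?? 0, b[replica] ?? 0);
  }
  return out;
}

function total(c: GCounter): number {
  return Object.keys(c).reduce((sum, k) => sum + (c[k] ?? 0), 0);
}

// Two replicas count independently, then exchange state in either order.
const phone = increment(increment({}, "phone"), "phone");
const laptop = increment({}, "laptop", 5);
const mergedOnPhone = merge(phone, laptop);
const mergedOnLaptop = merge(laptop, phone);
```

Merging the same state twice changes nothing, and the order of merges does not matter, which is why no server has to decide who goes first. Sequence CRDTs for text are far more involved, but they rest on the same three properties.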
For collaborative text editing, the dominant CRDT approach is to assign each character a unique, globally ordered identifier. Yjs uses a sequence CRDT where every character gets an ID based on the client that created it and a logical clock. When you insert a character between two existing characters, the new character's position is defined by its neighbors, not by a numeric index. This means insertions never conflict, because every character has a unique identity that persists regardless of what other users are doing.
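Here is a minimal sketch of the idea, loosely following the RGA family of sequence CRDTs rather than Yjs's actual algorithm. The `Replica` class, the Lamport-style `clock`, and the rule of ordering concurrent siblings by ID are assumptions made for illustration. Deletions are omitted, and each item is assumed to arrive after its left neighbor.

```typescript
type Id = { clock: number; client: string };
type Item = { id: Id; after: Id | null; ch: string };

function sameId(a: Id, b: Id): boolean {
  return a.clock === b.clock && a.client === b.client;
}

// Total order on IDs: higher clock sorts first, client name breaks ties.
function newer(a: Id, b: Id): boolean {
  return a.clock > b.clock || (a.clock === b.clock && a.client > b.client);
}

class Replica {
  private items: Item[] = [];
  private clock = 0;

  constructor(private client: string) {}

  // Local insert at an index. The returned item is what gets sent to peers:
  // it names its left neighbor by ID, not by numeric position.
  insert(index: number, ch: string): Item {
    const left = index === 0 ? null : this.items[index - 1].id;
    const item: Item = {
      id: { clock: ++this.clock, client: this.client },
      after: left,
      ch,
    };
    this.integrate(item);
    return item;
  }

  // Place a local or remote item. Keeping the clock ahead of everything
  // seen so far is what makes the skip rule below safe.
  integrate(item: Item): void {
    this.clock = Math.max(this.clock, item.id.clock);
    let i = 0;
    const anchor = item.after;
    if (anchor !== null) {
      while (i < this.items.length && !sameId(this.items[i].id, anchor)) i++;
      i++; // slot immediately after the left neighbor
    }
    // Skip concurrent items that sort ahead of this one.
    while (i < this.items.length && newer(this.items[i].id, item.id)) i++;
    this.items.splice(i, 0, item);
  }

  get text(): string {
    return this.items.map(it => it.ch).join("");
  }
}
```

If two replicas both insert after the same character while disconnected, each integrates the other's item with the same comparison and lands on the same order, with no index arithmetic involved.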
Yjs: The Performance Leader
Yjs is the most popular CRDT library for JavaScript and the one we recommend for most collaborative editing projects. It uses an optimized encoding that represents runs of characters inserted by the same user as a single struct, dramatically reducing memory overhead compared to naive per-character CRDTs. Yjs supports shared types for text (Y.Text), arrays (Y.Array), maps (Y.Map), and XML fragments (Y.XmlFragment), making it suitable for everything from plain text to complex nested documents.
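The run-based representation can be illustrated with a short sketch. This is our simplified model of the idea, not Yjs's internal format: `CharItem`, `Run`, and `compact` are hypothetical names, and each character is assumed to occupy exactly one clock tick.

```typescript
type CharItem = { client: number; clock: number; ch: string };

// One run covers clocks `clock` through `clock + text.length - 1`.
type Run = { client: number; clock: number; text: string };

// Collapse neighboring characters into a single run when they come from the
// same client and their clocks are consecutive.
function compact(items: CharItem[]): Run[] {
  const runs: Run[] = [];
  for (const it of items) {
    const last = runs[runs.length - 1];
    if (
      last !== undefined &&
      last.client === it.client &&
      last.clock + last.text.length === it.clock
    ) {
      last.text += it.ch;
    } else {
      runs.push({ client: it.client, clock: it.clock, text: it.ch });
    }
  }
  return runs;
}

// A user types "hello" in one go, then a second user appends "!".
const typed: CharItem[] = "hello"
  .split("")
  .map((ch, i) => ({ client: 1, clock: i, ch }));
typed.push({ client: 2, clock: 0, ch: "!" });
const runs = compact(typed);
```

Six per-character records collapse into two runs here. Because people mostly type in contiguous bursts, real documents compress in a similar way, which is where most of the memory saving comes from.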
Performance numbers: Yjs can handle documents with millions of operations, merging 100,000 remote changes in under 50ms on modern hardware. Its binary encoding format produces document snapshots 5 to 10x smaller than equivalent JSON representations. For a detailed comparison of CRDT libraries, check our Yjs vs Automerge vs Liveblocks breakdown.
Automerge: The Correctness-First Approach
Automerge takes a more academic approach to CRDTs with a focus on correctness proofs and a clean API. It models documents as JSON-like trees where every node has a unique ID and a causal history. Automerge 2.0 (rewritten in Rust with WASM bindings) closed the performance gap with Yjs significantly, though Yjs still wins on raw throughput for text-heavy use cases.
Automerge's biggest advantage is its change history. Every mutation is stored as a discrete change object, making it straightforward to implement undo/redo, time travel, and audit logs. If your product needs a version history feature, Automerge gives you that almost for free.
Conflict Resolution Mechanics: The Deep Differences
The fundamental difference between OT and CRDTs is where conflict resolution logic lives. In OT, it lives in the transform functions. In CRDTs, it is baked into the data structure itself. This distinction has cascading consequences for your entire system.
Intention Preservation
OT's transform functions can be tuned to preserve user intent in ways that CRDTs struggle with. For example, when one user bolds a word while another italicizes it, OT can handle this as two non-conflicting formatting operations. CRDTs can handle this too (it is a straightforward map merge), but more complex intention scenarios, like two users rearranging the same list, produce different results depending on the CRDT implementation.
Consider this scenario: User A moves item 3 to position 1, while User B moves item 3 to position 5. With OT, you can write a transform function that detects the conflict and applies a specific resolution policy (last-writer-wins, prompt the user, or merge both moves). With CRDTs, the resolution is determined by the data structure's merge semantics, which might duplicate the item or pick a winner based on actor IDs. You get less control over the outcome.
Convergence Guarantees
CRDTs have a mathematical proof of convergence: any two replicas that have received the same set of operations will have the same state, regardless of the order operations were received. OT's convergence depends on the correctness of your transform functions. Google Wave's OT implementation reportedly took around two years to build, and engineers who worked on it have described the algorithms as hard and time-consuming to implement correctly. Getting OT transform functions right for complex document models is genuinely hard.
Tombstones vs Garbage Collection
Most text CRDTs use tombstones for deletions. When a character is deleted, it is not actually removed from the data structure. It is marked as deleted but retained so that concurrent operations referencing it can still be applied correctly. Over time, tombstones accumulate and consume memory. Yjs mitigates this with struct compaction, but a document with 1 million total edits (including deletions) will use more memory than the same document represented as plain text. OT does not have this problem, because deletions actually remove data.
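A short sketch shows why the deleted character has to stay around. The `Elem` shape and the string IDs are illustrative, and ordering among concurrent inserts at the same anchor is ignored here.

```typescript
type Elem = { id: string; ch: string; deleted: boolean };

function indexOfId(doc: Elem[], id: string): number {
  for (let i = 0; i < doc.length; i++) {
    if (doc[i].id === id) return i;
  }
  return -1;
}

// Deleting flags the element instead of removing it.
function remove(doc: Elem[], id: string): Elem[] {
  return doc.map(e => (e.id === id ? { ...e, deleted: true } : e));
}

// Inserts are anchored to an element ID. The anchor may be a tombstone.
function insertAfter(doc: Elem[], anchorId: string, elem: Elem): Elem[] {
  const i = indexOfId(doc, anchorId);
  if (i < 0) throw new Error("anchor not found: " + anchorId);
  return doc.slice(0, i + 1).concat([elem], doc.slice(i + 1));
}

function visibleText(doc: Elem[]): string {
  return doc.filter(e => !e.deleted).map(e => e.ch).join("");
}

const original: Elem[] = [
  { id: "a1", ch: "A", deleted: false },
  { id: "a2", ch: "B", deleted: false },
  { id: "a3", ch: "C", deleted: false },
];

// One user deletes "B" while another concurrently inserts "X" after "B".
const afterDelete = remove(original, "a2");
const afterBoth = insertAfter(afterDelete, "a2", { id: "b1", ch: "X", deleted: false });
```

Had the "B" element been physically removed, the insert anchored to `a2` would have nowhere to go. The cost is that `afterBoth` holds four elements to represent three visible characters, and that gap grows with every deletion.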
Architecture and Network Topology
This is where the two approaches diverge most sharply in practice. Production OT deployments depend on a central server. CRDTs work with any network topology. Your architecture requirements should drive this decision more than any other factor.
OT: Client-Server Only
Standard OT algorithms assume a star topology where every client communicates through a central server. The server maintains the canonical document state, assigns operation sequence numbers, and performs transformations. Decentralized OT algorithms exist in academic papers (SOCT2, GOTO, ABT), but they are significantly more complex and have not seen production adoption. If you choose OT, you are committing to running a server that is always available during collaboration sessions.
This has scaling implications. Every active document needs server-side state: the current document, a buffer of recent operations for late-arriving transforms, and connection state for each client. For ShareDB, plan on 5 to 20 MB of server memory per active document, depending on document complexity and operation buffer size. At 10,000 concurrent documents, that is 50 to 200 GB of server RAM.
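The capacity arithmetic is simple enough to keep as a helper next to your load-test numbers. The function name is ours, the per-document figures are the estimates quoted above, and units are decimal (1 GB = 1,000 MB).

```typescript
// Rough fleet-wide memory for server-side document state.
function fleetMemoryGB(activeDocs: number, mbPerDoc: number): number {
  return (activeDocs * mbPerDoc) / 1000;
}

const lowEstimateGB = fleetMemoryGB(10_000, 5);   // light documents
const highEstimateGB = fleetMemoryGB(10_000, 20); // heavy documents, large op buffers
```

Swap in measured per-document memory from your own workload before sizing anything, since operation buffer length dominates the result.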
CRDTs: Flexible Topology
CRDTs work peer-to-peer, client-server, or any hybrid. Two clients can sync directly over WebRTC without any server involvement. A client can sync with a server, then go offline, make edits, and merge seamlessly when reconnected. Multiple servers can sync with each other without coordination. This flexibility is why CRDTs are the foundation of local-first software.
Yjs provides pluggable "providers" for different network transports: y-websocket for client-server WebSocket sync, y-webrtc for peer-to-peer sync, y-indexeddb for local persistence, and y-dat for Hypercore-based P2P sync. You can stack multiple providers on the same document, and Yjs handles deduplication automatically. This composability is powerful for products that need to work across different connectivity scenarios.
Offline Support
CRDTs handle offline editing natively. A user can disconnect, make extensive edits, reconnect hours later, and merge cleanly. The merge is deterministic and automatic. OT can support offline editing, but it requires buffering all local operations and transforming them against the server's operation history on reconnection. This is technically possible (Google Docs does limited offline editing), but the reconnection logic is complex and error-prone, and its cost grows with the number of operations on both sides. If offline support is a core requirement, CRDTs are the clear winner.
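To give a sense of what OT reconnection involves, here is a minimal sketch of rebasing a buffer of offline operations against the server's history. It handles inserts only, and `rebase`, `xform`, and the `site` tie-breaker are our own illustrative names. A real implementation also has to deal with deletes, formatting, and acknowledgements.

```typescript
type Ins = { pos: number; text: string; site: string };

// Shift `op` so it applies after the concurrent `other`.
function xform(op: Ins, other: Ins): Ins {
  const otherFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  return otherFirst ? { ...op, pos: op.pos + other.text.length } : op;
}

function applyIns(doc: string, op: Ins): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Transform every buffered local op against every server op, and vice versa.
// `local` is what the client sends to the server; `server` is what the
// client applies on top of its own offline edits.
function rebase(localOps: Ins[], serverOps: Ins[]): { local: Ins[]; server: Ins[] } {
  let server = serverOps.slice();
  const local: Ins[] = [];
  for (const l of localOps) {
    let cur = l;
    const nextServer: Ins[] = [];
    for (const s of server) {
      nextServer.push(xform(s, cur)); // server op as seen after this local op
      cur = xform(cur, s);            // local op as seen after this server op
    }
    local.push(cur);
    server = nextServer;
  }
  return { local, server };
}
```

The nested loop is the important part: the work is proportional to the number of offline operations times the number of server operations missed, and every pair is another chance for a transform bug to surface. CRDT merges avoid this step entirely.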
Performance, Memory, and Document Size
Performance characteristics differ significantly between OT and CRDTs, and the right choice depends on your document profile: how large are your documents, how many concurrent editors, and how long do documents live?
Operation Throughput
For simple text operations, each pairwise transform is O(1), so the server's cost for an incoming operation is proportional to the number of concurrent operations it must be transformed against, which is usually a handful. Client-side, applying a remote operation follows the same pattern. The bottleneck is the server, which must process operations sequentially for each document to maintain ordering. A single ShareDB process can handle roughly 5,000 to 10,000 operations per second per document before becoming a bottleneck.
CRDTs process operations slightly slower per-operation due to the overhead of maintaining unique IDs and causal metadata. Yjs stores its items in a doubly linked list and keeps a small cache of recently used positions (search markers), so the localized edits typical of human typing resolve quickly, while an insert far from any cached position costs a partial walk of the list. In practice, this overhead is negligible for documents under 100,000 characters. For very large documents (1M+ characters), position lookups become noticeable.
Memory Overhead
This is where CRDTs pay a real cost. A plain text document of 10,000 ASCII characters takes about 10 KB to store as a UTF-8 string. The same document stored as a Yjs CRDT requires 30 to 50 KB (3 to 5x overhead) due to character IDs, client metadata, and internal data structures. With tombstones from deleted text, the overhead can reach 10x or more for heavily edited documents.
Automerge's overhead is higher than Yjs. The same 10,000 character document in Automerge uses 80 to 150 KB, because Automerge stores the full operation history by default. You can compact history to reduce this, but it is still heavier than Yjs.
OT does not have inherent memory overhead. The document is stored as its native format (a string, a JSON object, whatever your schema is). Operation history is stored separately and can be pruned aggressively since you only need recent operations for active sessions.
Document Loading Time
Loading a CRDT document means replaying or deserializing the entire state, including metadata. A 1 MB Yjs document (representing maybe 200 KB of actual content) takes 10 to 30 ms to load and parse. An equivalent OT document loads the raw content (200 KB) in under 1 ms, then establishes a sync connection. For applications with many small documents that users switch between frequently (like a note-taking app), CRDT loading overhead is noticeable.
Decision Framework: When to Use Each
After building collaborative features across dozens of projects, here is the framework we use to decide between OT and CRDTs. It comes down to five questions.
1. Do you need offline editing?
If yes, use CRDTs. This is the single strongest differentiator. OT's offline story is bolted on and fragile. CRDTs handle it natively. If your users need to edit on planes, in basements, or in rural areas with spotty connectivity, CRDTs save you months of engineering pain.
2. Is your collaboration model simple text or rich documents?
For plain text or lightly formatted text (Markdown, code), both work well. For complex structured documents with tables, embeds, comments, and track changes, OT gives you more granular control over conflict resolution through custom transform functions. CRDTs can handle rich documents (Yjs powers multiple ProseMirror and TipTap collaborative editors), but edge cases in complex document models require more careful CRDT schema design.
3. How many concurrent editors per document?
For 2 to 20 editors, both approaches handle it comfortably. For 50+ editors on a single document (live events, classroom scenarios, collaborative brainstorming), CRDTs scale more gracefully because they do not require server-side operation sequencing. OT's server becomes the bottleneck at high concurrency because every operation must be sequenced through it.
4. Do you already have server infrastructure?
If you are building a standard SaaS app with an existing server, OT fits naturally into your architecture. The server is already there, and ShareDB or similar OT libraries plug into your Node.js backend. If you want to avoid running collaboration-specific servers, CRDTs let you use serverless architectures or peer-to-peer connectivity with optional relay servers. For teams building collaborative whiteboard apps or canvas-based tools, the peer-to-peer option is particularly attractive.
5. What is your team's expertise?
OT is conceptually simpler for a basic implementation but harder to extend. CRDTs have a steeper initial learning curve but are more composable once understood. If you have a small team with limited distributed systems experience, start with a CRDT library like Yjs that handles the complexity for you. Writing correct OT transform functions from scratch is a specialized skill.
The Quick Summary
- Choose OT if you have a centralized server, need fine-grained conflict resolution control, are building rich text editing with complex formatting, and do not need offline support
- Choose CRDTs if you need offline editing, want peer-to-peer sync, are building a local-first app, need to scale to many concurrent editors, or want architectural flexibility
- Choose a managed service (Liveblocks, Convergence) if you want collaboration features without managing the infrastructure yourself
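The summary above can be written down as a small decision helper, which is handy for architecture review checklists. This is only a sketch of our rule of thumb: the `Needs` fields and the order of the checks are our own simplification, and real projects will have factors it does not capture.

```typescript
type Needs = {
  wantsManagedInfra: boolean;
  offlineEditing: boolean;
  peerToPeer: boolean;
  manyConcurrentEditors: boolean; // roughly 50+ on a single document
  centralServer: boolean;
  fineGrainedConflictControl: boolean;
};

type Recommendation = "managed service" | "CRDT" | "OT" | "either";

function recommend(n: Needs): Recommendation {
  // Not wanting to run the infrastructure at all overrides everything else.
  if (n.wantsManagedInfra) return "managed service";
  // Any of these pushes toward CRDTs on its own.
  if (n.offlineEditing || n.peerToPeer || n.manyConcurrentEditors) return "CRDT";
  // OT pays off when a server exists and custom resolution rules matter.
  if (n.centralServer && n.fineGrainedConflictControl) return "OT";
  return "either";
}

const richTextSaas = recommend({
  wantsManagedInfra: false,
  offlineEditing: false,
  peerToPeer: false,
  manyConcurrentEditors: false,
  centralServer: true,
  fineGrainedConflictControl: true,
});
```

Offline editing is checked before anything OT-related on purpose, because it is the requirement that is hardest to retrofit later.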
Real-World Production Examples
The best way to evaluate these architectures is to look at what production systems actually use and why they made their choices.
Google Docs: OT at Google Scale
Google Docs uses OT with a centralized server that sequences all operations. Google acquired Writely in 2006, and the rebuilt Docs editor it launched in 2010 is based on OT. Google has invested heavily in making it work at scale. Their OT implementation handles rich text, tables, images, comments, suggestions, and dozens of other content types. The transform function library is reportedly one of the more complex pieces of code at Google. They make it work because they can afford a large engineering team to maintain it.
Figma: CRDTs for Design Collaboration
Figma uses a custom, CRDT-inspired system (not Yjs or Automerge) optimized for their specific data model: a tree of design objects with properties. It handles concurrent moves (reparenting objects in the layer tree), property changes (two users changing the same fill color), and deletions. Figma went this way because their data model is a tree of objects rather than a linear text sequence, and CRDT-style per-property merging maps more naturally to tree structures. Because Figma's server remains the final authority, the system can skip some of the overhead a fully decentralized CRDT carries. Figma has also described OT as more complex than their problem required.
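Linear: Local-First Sync for Issue Tracking
Linear is frequently cited alongside CRDT-based products, but as publicly described, its sync engine is a custom design rather than a general-purpose CRDT library: clients read and write against a local store, and changes are ordered through the server. The result is still offline support and instant UI updates. Their local-first architecture means the app feels fast even on slow connections, because all reads and writes happen against local state.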
Notion: Custom OT-Like System
Notion uses a custom conflict resolution system that borrows from both OT and CRDTs. Their block-based document model (where each paragraph, heading, or embed is a discrete block) reduces conflict surface area. Concurrent edits to different blocks never conflict. Concurrent edits within the same block use an OT-like resolution strategy.
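The benefit of a block model is easy to show in code. The sketch below is a generic illustration of block-level conflict detection, not Notion's implementation. `BlockEdit` and `splitByBlock` are hypothetical names.

```typescript
type BlockEdit = { blockId: string; author: string; newText: string };

// Separate two users' concurrent edits into those that can be applied
// independently and the block IDs that both users touched.
function splitByBlock(
  a: BlockEdit[],
  b: BlockEdit[]
): { independent: BlockEdit[]; contested: string[] } {
  const idsA = a.map(e => e.blockId);
  const contested: string[] = [];
  for (const e of b) {
    const touchedByA = idsA.indexOf(e.blockId) >= 0;
    if (touchedByA && contested.indexOf(e.blockId) < 0) {
      contested.push(e.blockId);
    }
  }
  const independent = a
    .concat(b)
    .filter(e => contested.indexOf(e.blockId) < 0);
  return { independent, contested };
}

const fromAlice: BlockEdit[] = [
  { blockId: "heading-1", author: "alice", newText: "Q3 Roadmap" },
  { blockId: "para-2", author: "alice", newText: "Ship the editor." },
];
const fromBob: BlockEdit[] = [
  { blockId: "para-2", author: "bob", newText: "Ship the new editor." },
  { blockId: "para-7", author: "bob", newText: "Open questions" },
];
const result = splitByBlock(fromAlice, fromBob);
```

Only the edits to the contested block need a character-level strategy such as OT. Everything in `independent` can be applied in any order, which shrinks the hard part of the problem to a small share of real-world edits.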
What This Tells You
The companies with the most complex document models and the largest engineering teams tend toward OT (or OT-like approaches), because they need control over conflict resolution semantics. Companies building local-first products or with simpler data models lean toward CRDTs for their architectural flexibility and offline support. There is no universally correct answer.
If you are planning a collaborative product and want to move fast without spending months on infrastructure, we can help you evaluate the right architecture for your specific use case and get a working prototype in weeks. Book a free strategy call and we will walk through the tradeoffs together.