Architecture Overview: Cloud CMS, Device Player, and Management API
A digital signage platform has three core pieces that need to work in concert: a cloud-based content management system where operators create and schedule content, a device player application running on physical hardware at each screen location, and a management API that ties everything together. Understanding this triangle is the foundation for every design decision you will make.
The cloud CMS is your operators' primary interface. It is where they upload media, build playlists, create screen layouts with multiple content zones, and schedule campaigns. Think of it as the "brain" that decides what content plays where and when. This needs to be a snappy web application with drag-and-drop UX, because your end users are often marketing teams or retail managers, not developers.
The device player is the "muscle." It is a lightweight application running on the actual hardware connected to each display. It receives content instructions from the cloud, caches media assets locally, renders the layout to the screen, and reports its health back to the management API. The critical design constraint here is that the player must work even when the internet connection drops. A restaurant menu board that goes blank during a Wi-Fi outage is unacceptable.
The management API sits between these two. It handles authentication, content distribution, device registration, health monitoring, and analytics collection. It exposes REST endpoints for the CMS frontend and uses MQTT for real-time device communication, with WebSocket connections reserved for live dashboard updates in the browser. In a well-designed system, the CMS never talks directly to devices. Everything flows through the API, which enforces permissions, validates data, and maintains the source of truth in the database.
Here is the high-level data flow: an operator creates a playlist in the CMS, the CMS sends it to the API, the API stores it in PostgreSQL, then publishes a "content update" message via MQTT to the relevant device topics. Each subscribed player receives the message, downloads any new media assets from the CDN, and transitions to the updated playlist. The player then reports "playlist activated" back through MQTT, and the API logs the proof-of-play event. This entire cycle should complete in under 30 seconds for a typical content update.
Content Management System Design: Media, Playlists, Scheduling, and Layouts
The CMS is where your platform lives or dies commercially. Competing signage platforms like Screenly, Yodeck, and Rise Vision all have capable players. What differentiates them is how intuitive the content management experience feels. Your goal is to let a non-technical store manager create a professional-looking display in under five minutes.
Media upload and processing. Support images (JPEG, PNG, WebP), video (MP4 with H.264/H.265), and web content (URLs rendered as iframes or via embedded Chromium). When a user uploads a video, your backend should transcode it into device-optimized formats using FFmpeg. A 4K video from a marketing agency needs to be downscaled to 1080p for most signage hardware. Store the original in S3 for archival, generate a 1080p rendition for playback, and create a thumbnail for the CMS preview grid. Use a job queue (BullMQ with Redis) to handle transcoding asynchronously so uploads feel instant.
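As a rough sketch of the transcoding step, the job worker could build its FFmpeg command lines like this. The helper names, the CRF and preset values, and the one-second thumbnail offset are illustrative assumptions, not settings the platform requires; the FFmpeg flags themselves are standard.

```typescript
// Hypothetical input description for a transcoding job.
interface SourceVideo {
  inputPath: string;
  height: number; // source height in pixels, e.g. 2160 for 4K
}

// Builds FFmpeg arguments for the 1080p playback rendition.
function renditionArgs(src: SourceVideo, outputPath: string): string[] {
  const args: string[] = ["-i", src.inputPath];
  // Downscale only when the source is taller than 1080p. A width of -2
  // keeps the aspect ratio and rounds to an even number, which H.264 needs.
  if (src.height > 1080) {
    args.push("-vf", "scale=-2:1080");
  }
  args.push(
    "-c:v", "libx264",          // H.264 video (assumed default codec)
    "-preset", "medium",        // assumed speed/size trade-off
    "-crf", "23",               // assumed quality target
    "-c:a", "aac",
    "-movflags", "+faststart",  // put the MP4 index at the front of the file
    outputPath
  );
  return args;
}

// Builds FFmpeg arguments that grab one frame for the CMS preview grid.
function thumbnailArgs(inputPath: string, outputPath: string): string[] {
  return ["-ss", "1", "-i", inputPath, "-frames:v", "1", outputPath];
}
```

A queue worker would pass these arrays to a child process running ffmpeg, then upload the outputs next to the archived original.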
Playlist creation. A playlist is an ordered collection of media items, each with a duration. The data model is straightforward: a playlist has many playlist items, each referencing a media asset with a display duration in seconds, a sort order, and optional transition effects (fade, slide, cut). Store this in PostgreSQL with a simple schema:
- playlists table: id, name, tenant_id, created_at, updated_at
- playlist_items table: id, playlist_id, media_id, duration_seconds, sort_order, transition_type
- media table: id, tenant_id, original_url, processed_url, thumbnail_url, type (image/video/web), file_size_bytes, duration_seconds (for video), metadata_json
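The same schema can be mirrored as shared TypeScript types. This is a minimal sketch: the camelCase field names and the two helper functions are assumptions for illustration, and only a subset of the media columns is shown.

```typescript
type MediaType = "image" | "video" | "web";
type TransitionType = "fade" | "slide" | "cut";

interface Media {
  id: string;
  tenantId: string;
  type: MediaType;
  processedUrl: string;
  thumbnailUrl: string;
  fileSizeBytes: number;
  durationSeconds?: number; // only present for video
}

interface PlaylistItem {
  id: string;
  playlistId: string;
  mediaId: string;
  durationSeconds: number;
  sortOrder: number;
  transitionType?: TransitionType;
}

// Returns the items in playback order without mutating the input array.
function orderedItems(items: PlaylistItem[]): PlaylistItem[] {
  return [...items].sort((a, b) => a.sortOrder - b.sortOrder);
}

// Length of one full loop of the playlist, in seconds.
function loopDurationSeconds(items: PlaylistItem[]): number {
  return items.reduce((sum, item) => sum + item.durationSeconds, 0);
}
```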
Scheduling engine. This is where things get interesting. Operators need to schedule different content for different times: breakfast menu from 6am to 11am, lunch menu from 11am to 3pm, happy hour specials from 3pm to 6pm, and a default playlist for all other times. Your scheduler needs to handle recurring rules (every Monday through Friday), date ranges (Black Friday promotion from Nov 25 to Nov 30), priority levels (an emergency alert overrides everything), and timezone awareness (a chain with locations in Eastern and Pacific time zones).
Model schedules using a structure inspired by iCalendar's RRULE format. Each schedule entry contains: playlist_id, device_group_id, start_time, end_time, recurrence_rule (daily, weekly, monthly, or custom cron expression), priority (integer, higher wins), and timezone. The player evaluates schedules locally to determine which playlist to show at any given moment. This local evaluation is critical because it means the schedule keeps working even when the device is offline.
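Local schedule evaluation might look like the sketch below. It is deliberately simplified: a list of weekdays and a minute-of-day window stand in for a full recurrence rule, date ranges are omitted, and every type and function name is hypothetical. Timezone conversion uses the standard Intl.DateTimeFormat API.

```typescript
interface ScheduleEntry {
  playlistId: string;
  daysOfWeek: string[]; // e.g. ["Mon", "Tue"]; simplified stand-in for a recurrence rule
  startMinute: number;  // minutes after local midnight, inclusive
  endMinute: number;    // minutes after local midnight, exclusive
  priority: number;     // higher wins
  timezone: string;     // IANA zone name, e.g. "America/New_York"
}

// Converts an instant to the weekday and minute-of-day in the given zone.
function localClock(at: Date, timezone: string): { weekday: string; minute: number } {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: timezone,
    weekday: "short",
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(at);
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";
  // Some engines report midnight as hour 24 when hour12 is false, hence % 24.
  const hour = Number(get("hour")) % 24;
  return { weekday: get("weekday"), minute: hour * 60 + Number(get("minute")) };
}

// Picks the highest-priority entry that matches the instant, else the fallback.
function activePlaylist(entries: ScheduleEntry[], at: Date, fallback: string): string {
  let best: ScheduleEntry | undefined;
  for (const entry of entries) {
    const { weekday, minute } = localClock(at, entry.timezone);
    const matches =
      entry.daysOfWeek.includes(weekday) &&
      minute >= entry.startMinute &&
      minute < entry.endMinute;
    if (matches && (best === undefined || entry.priority > best.priority)) {
      best = entry;
    }
  }
  return best ? best.playlistId : fallback;
}
```

Because the function takes the current instant as a parameter instead of reading the clock itself, the player can run it offline and tests can pin the time.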
Content zones and layouts. Most real-world signage screens are not single-content displays. A retail screen might show a product video in the main area, a scrolling news ticker at the bottom, and a weather widget in the corner. You need a layout engine that divides the screen into zones, each running its own playlist independently. Define layouts as JSON templates with named regions and pixel coordinates. A layout for a 1920x1080 screen might have a "main" zone (0,0 to 1440,810), a "sidebar" zone (1440,0 to 1920,810), and a "ticker" zone (0,810 to 1920,1080), so the three zones tile the screen without overlapping. Each zone gets assigned its own playlist and schedule. The player renders each zone as an independent layer, composited on screen.
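A layout template and a basic sanity check could be sketched as follows. The field names are assumptions, and the example uses a three-zone arrangement on a 1920x1080 screen in which the zones tile the screen without overlapping.

```typescript
interface Zone {
  name: string;
  x: number;      // left edge in pixels
  y: number;      // top edge in pixels
  width: number;
  height: number;
}

interface Layout {
  name: string;
  width: number;
  height: number;
  zones: Zone[];
}

// Example template: main video area, sidebar widget, bottom ticker.
const threeZoneLayout: Layout = {
  name: "main-sidebar-ticker",
  width: 1920,
  height: 1080,
  zones: [
    { name: "main", x: 0, y: 0, width: 1440, height: 810 },
    { name: "sidebar", x: 1440, y: 0, width: 480, height: 810 },
    { name: "ticker", x: 0, y: 810, width: 1920, height: 270 },
  ],
};

// True when two rectangles share any area (touching edges do not count).
function overlaps(a: Zone, b: Zone): boolean {
  return (
    a.x < b.x + b.width && b.x < a.x + a.width &&
    a.y < b.y + b.height && b.y < a.y + a.height
  );
}

// Returns human-readable problems; an empty array means the layout is valid.
function validateLayout(layout: Layout): string[] {
  const errors: string[] = [];
  layout.zones.forEach((zone, i) => {
    const outside =
      zone.x < 0 || zone.y < 0 ||
      zone.x + zone.width > layout.width ||
      zone.y + zone.height > layout.height;
    if (outside) errors.push(`${zone.name} extends beyond the screen`);
    for (const other of layout.zones.slice(i + 1)) {
      if (overlaps(zone, other)) errors.push(`${zone.name} overlaps ${other.name}`);
    }
  });
  return errors;
}
```

Running a check like this when an operator saves a layout in the CMS catches overlapping zones before they reach a device.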
Device Communication: MQTT, WebSocket, and Offline-First Design
Communication between your cloud platform and thousands of deployed display devices is the most technically demanding part of the entire system. You are dealing with unreliable networks, devices behind NATs and firewalls, intermittent connectivity, and the hard requirement that screens must keep showing content no matter what happens to the internet connection.
MQTT for heartbeat and commands. MQTT is the right protocol here, and it is not even close. It was designed for constrained devices on unreliable networks. It uses minimal bandwidth, supports persistent sessions so devices can receive messages they missed while offline, and has built-in QoS levels. Use MQTT v5 with an enterprise broker like EMQX or HiveMQ. Each device subscribes to its own command topic (e.g., devices/{device_id}/commands) and publishes to a heartbeat topic (e.g., devices/{device_id}/heartbeat).
Heartbeat messages should fire every 30 to 60 seconds and include: device ID, current playlist ID, player version, system uptime, CPU temperature, memory usage, disk free space, and network signal strength. This telemetry feeds your device management dashboard. If a heartbeat stops arriving, your backend marks the device as "offline" after 3 missed intervals and can trigger an alert to the operator.
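The heartbeat payload and the three-missed-intervals rule can be sketched like this. The field names and units are assumptions; only the list of metrics and the threshold of three come from the description above.

```typescript
// Hypothetical heartbeat payload; field names and units are illustrative.
interface Heartbeat {
  deviceId: string;
  currentPlaylistId: string | null;
  playerVersion: string;
  uptimeSeconds: number;
  cpuTempCelsius: number;
  memoryUsedPercent: number;
  diskFreeBytes: number;
  signalStrengthDbm?: number; // absent on wired connections
}

const MISSED_INTERVALS_BEFORE_OFFLINE = 3;

// A device counts as offline once more than three heartbeat intervals
// have passed since the last heartbeat arrived.
function isOffline(lastHeartbeatMs: number, nowMs: number, intervalSeconds: number): boolean {
  const allowedSilenceMs = intervalSeconds * 1000 * MISSED_INTERVALS_BEFORE_OFFLINE;
  return nowMs - lastHeartbeatMs > allowedSilenceMs;
}
```

With a 60-second interval, a device is flagged a little over three minutes after its last heartbeat.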
Commands flow the other direction. When an operator publishes new content, the API sends a command message to the device's MQTT topic with the updated playlist manifest. Other command types include: restart player, reboot device, take screenshot, update firmware, clear cache, and rotate display. Use QoS 1 (at-least-once delivery) for commands and QoS 0 (fire and forget) for heartbeats. With QoS 1, the broker queues commands for an offline device and delivers them on reconnect, provided the device connects with a persistent session (in MQTT v5, Clean Start disabled and a non-zero Session Expiry Interval). Because QoS 1 can deliver the same message more than once, make command handlers idempotent.
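One way to keep command handling type-safe is a discriminated union shared between the API and the player. The command names and payload fields below are hypothetical; the topic patterns and QoS choices follow the text.

```typescript
// Hypothetical command set; payload fields are illustrative.
type Command =
  | { type: "update_playlist"; manifest: { playlistId: string; mediaUrls: string[] } }
  | { type: "restart_player" }
  | { type: "reboot_device" }
  | { type: "take_screenshot" }
  | { type: "update_firmware"; version: string }
  | { type: "clear_cache" }
  | { type: "rotate_display"; degrees: 0 | 90 | 180 | 270 };

interface OutboundMessage {
  topic: string;
  payload: string;
  qos: 0 | 1;
}

// Commands use QoS 1 so the broker retries until the device acknowledges.
function commandMessage(deviceId: string, command: Command): OutboundMessage {
  return {
    topic: `devices/${deviceId}/commands`,
    payload: JSON.stringify(command),
    qos: 1,
  };
}

// Heartbeats use QoS 0: a lost heartbeat is replaced by the next one.
function heartbeatMessage(deviceId: string, body: object): OutboundMessage {
  return {
    topic: `devices/${deviceId}/heartbeat`,
    payload: JSON.stringify(body),
    qos: 0,
  };
}
```

The resulting object can be handed to whichever MQTT client library the API uses; the publish call itself is left out because it depends on that library.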
WebSocket for real-time dashboard updates. While MQTT handles device communication, your management dashboard needs real-time updates too. When a device comes online, goes offline, or reports an error, operators should see it immediately without refreshing the page. Use WebSocket connections from the browser to your API server. The API subscribes to device heartbeat topics on the MQTT broker and relays status changes to connected dashboard sessions over WebSocket. Socket.IO with Redis adapter works well here for horizontal scaling across multiple API server instances. For a deeper comparison of real-time transport options, see our real-time features guide.
Offline-first player design. This is non-negotiable. Your player must cache everything it needs to operate independently for days. When a playlist is assigned, the player downloads all media assets to local storage before switching to the new content. It stores the complete schedule definition locally. It evaluates the schedule using the device's local clock (synced via NTP when online). If the internet drops, the player keeps running the current schedule with cached assets. When connectivity returns, it syncs any queued proof-of-play events and checks for content updates. Design the local cache with an LRU eviction policy and reserve at least 2GB of disk space for media. On devices with limited storage (Raspberry Pi with a 16GB SD card), implement aggressive cache management that keeps only the assets needed for the current and next scheduled playlists.
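The eviction rule can be sketched as a pure function: drop the least recently used assets first, but never touch assets pinned by the current or next scheduled playlist. The names and the idea of passing the pinned set in explicitly are assumptions for illustration.

```typescript
interface CachedAsset {
  hash: string;       // cache key, e.g. the content hash of the file
  sizeBytes: number;
  lastUsedMs: number; // when the asset was last shown
}

// Returns the hashes to delete so the cache fits within maxBytes.
// Pinned assets are never evicted, even if the cache stays over budget.
function selectEvictions(
  assets: CachedAsset[],
  pinned: Set<string>,
  maxBytes: number
): string[] {
  let total = assets.reduce((sum, a) => sum + a.sizeBytes, 0);
  // Least recently used first.
  const candidates = assets
    .filter((a) => !pinned.has(a.hash))
    .sort((a, b) => a.lastUsedMs - b.lastUsedMs);
  const evict: string[] = [];
  for (const asset of candidates) {
    if (total <= maxBytes) break;
    evict.push(asset.hash);
    total -= asset.sizeBytes;
  }
  return evict;
}
```

On a storage-constrained device the player would call this before each download, with the pinned set built from the active schedule.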
Device Player Implementation: Chromium, Electron, and Caching Strategy
The device player is the piece of software that actually renders content on the physical display. It needs to be rock-solid reliable, capable of running 24/7 for months without intervention, and smart enough to handle edge cases like corrupted media files, out-of-memory conditions, and display resolution changes.
Chromium-based rendering. The modern approach to digital signage players uses Chromium as the rendering engine. HTML, CSS, and JavaScript are the most flexible content format available. You can render images, video, animations, live web pages, weather widgets, social media feeds, and interactive touch content all within the same engine. This means your content authoring tool in the CMS can generate standard web content that plays identically in the browser preview and on the physical device.
Electron for desktop-class hardware. If your target hardware runs a full desktop OS (Windows, Linux, or macOS), Electron gives you a Chromium renderer with Node.js integration. You get access to the file system for local caching, native MQTT client libraries, hardware APIs for display control (screen on/off, brightness, rotation), and the ability to run as a kiosk-mode application that prevents users from exiting. Package your player with Electron Forge. Use Electron's built-in autoUpdater for over-the-air player updates on Windows and macOS. It has no built-in Linux support, so Linux players need a separate update path, for example the distribution's package manager or a custom updater script. The typical Electron signage player bundle is 150 to 200MB, which is acceptable for devices with SSD storage.
CEF (Chromium Embedded Framework) for embedded hardware. For custom Linux-based hardware such as custom ARM boards, CEF gives you the Chromium rendering engine without Electron's overhead. On Android-based media players (common in commercial signage, for example Minix boxes), use a native Android app with a WebView configured in kiosk mode. Android's device owner mode (the DevicePolicyManager API) or Samsung Knox (for Samsung commercial displays) lets you lock down the device so only your player app runs. This prevents end-user tampering, which is a real concern in public-facing deployments. Note that some commercial players, BrightSign among them, run their own operating system and HTML5 runtime rather than Android, so they need a separate integration.
Caching strategy for offline playback. Your caching layer is what separates a professional signage platform from a glorified slideshow. Implement a three-tier cache:
- Manifest cache: The complete playlist and schedule definition, stored as JSON on the local file system. Updated whenever the cloud sends a new manifest via MQTT. The player reads this on startup to know what to display even before contacting the server.
- Media cache: All image and video files referenced by active playlists, stored on the local disk. Use content-addressable storage (filename is the SHA-256 hash of the file content) so you never download the same file twice even if it appears in multiple playlists. Before switching to a new playlist, verify all required media files are cached. If any are missing, keep playing the current playlist until the download completes.
- State cache: The player's runtime state, including which playlist item was last shown, cumulative play counts, and queued analytics events. Persist this to SQLite on the device. If the player crashes and restarts, it resumes from where it left off rather than restarting from the first item.
A critical implementation detail: never trust the network. Wrap every HTTP request to the CDN in retry logic with exponential backoff. If a media download fails three times, log the error, skip that item in the playlist, and try again on the next cycle. A single broken image should never bring down the entire display. Defensive coding in the player saves you from thousands of support tickets.
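A minimal sketch of that retry policy follows. The download function is injected by the caller, so the example makes no assumption about which HTTP client the player uses; the base delay, the cap, and the function names are illustrative.

```typescript
// Delay before the next attempt: 1s, 2s, 4s, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Tries the download up to maxAttempts times. Returns null when every
// attempt fails so the caller can skip the item and retry next cycle.
async function downloadWithRetry<T>(
  download: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await download();
    } catch (err) {
      console.error(`download attempt ${attempt + 1} failed`, err);
      // Wait before retrying, but not after the final attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  return null;
}
```

Returning null instead of throwing keeps the playback loop simple: a missing asset becomes a skipped item, not a crashed player.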
Device Management Dashboard: Monitoring, Control, and Remote Troubleshooting
When you have 500 displays deployed across 30 retail locations, you need a centralized dashboard that shows the health of every device at a glance. The management dashboard is what your platform's admin users and your own support team use daily. It needs to be fast, accurate, and actionable.
Health monitoring. Every device heartbeat (arriving via MQTT every 30 to 60 seconds) feeds into a real-time status board. Display each device as a card or row showing: online/offline status (green/red indicator), current content playing, last heartbeat timestamp, CPU temperature, memory and disk usage, network quality, and uptime since last reboot. Use a traffic-light system: green means healthy, yellow means a metric is approaching a threshold (disk usage above 80%, CPU temperature above 70°C), red means something needs attention (offline, disk full, player crashed). Aggregate views per location, per device group, and per tenant give operators the ability to spot problems before they get calls from store managers.
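The traffic-light rules translate into a small classification function. The yellow thresholds come from the text; the 98% cut-off for "disk full" and the type names are assumptions for the sketch.

```typescript
type Status = "green" | "yellow" | "red";

interface DeviceHealth {
  online: boolean;
  playerRunning: boolean;
  diskUsedPercent: number;
  cpuTempCelsius: number;
}

const DISK_WARN_PERCENT = 80;
const DISK_FULL_PERCENT = 98; // assumed cut-off for "disk full"
const CPU_WARN_CELSIUS = 70;

function healthStatus(h: DeviceHealth): Status {
  if (!h.online || !h.playerRunning || h.diskUsedPercent >= DISK_FULL_PERCENT) {
    return "red";
  }
  if (h.diskUsedPercent > DISK_WARN_PERCENT || h.cpuTempCelsius > CPU_WARN_CELSIUS) {
    return "yellow";
  }
  return "green";
}

// Aggregate view: a location or group is as healthy as its worst device.
function worstStatus(statuses: Status[]): Status {
  if (statuses.includes("red")) return "red";
  if (statuses.includes("yellow")) return "yellow";
  return "green";
}
```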
Remote restart and reboot. When a device misbehaves, the operator should be able to fix it without sending a technician. Expose two levels of restart: "restart player" (kills and relaunches the player application) and "reboot device" (issues an OS-level reboot command). Both are sent as MQTT commands. The player listens for restart commands and gracefully shuts down, while device-level reboot requires your player to have OS-level permissions. On Linux players, the player process runs under a user account with sudo permissions for the reboot command. On Android, use the Device Owner API. Log every remote action with the operator's identity for audit purposes.
Screenshot capture. This is the feature operators love most. They click "take screenshot" in the dashboard, and within 5 seconds they see exactly what is currently on the physical display. The player receives the screenshot command via MQTT, captures the current screen (on Linux, use a capture tool such as scrot; in Electron, call capturePage() on the BrowserWindow), uploads the image to S3, and sends back the URL in a response message. Display the screenshot inline in the dashboard. This eliminates the "is the screen actually showing the right content?" question that plagues signage deployments. Some platforms take this further with scheduled screenshot capture every 15 minutes for compliance verification.
Firmware and player updates. Over-the-air (OTA) updates are essential. You cannot ask a client to physically visit 200 locations to update player software. Design a staged rollout system: upload a new player version to the platform, assign it to a "canary" group of 5 devices, monitor for 24 hours, then roll out to the next 10%, then 50%, then 100%. Each device checks for available updates on startup and periodically (every 6 hours). When an update is available, the device downloads the new version, verifies its checksum, and applies it during an off-hours maintenance window to avoid disrupting live content. Always keep the previous version available for rollback. If the new version crashes on startup three times, automatically revert to the last known good version. This self-healing behavior is what makes the difference between a hobbyist project and a production-grade platform.
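Two pieces of this flow are easy to sketch: deciding whether a device falls inside the current rollout percentage, and reverting after repeated startup crashes. Hashing the device ID into a stable bucket is one possible approach, assumed here for illustration; the fixed canary group would be an explicit device list handled separately.

```typescript
// Maps a device ID to a stable bucket from 0 to 99 using an
// FNV-1a style hash over the ID's UTF-16 code units.
function rolloutBucket(deviceId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < deviceId.length; i++) {
    hash ^= deviceId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

// A device that is included at 10% stays included at 50% and 100%,
// because its bucket never changes.
function isInRollout(deviceId: string, percent: number): boolean {
  return rolloutBucket(deviceId) < percent;
}

const MAX_STARTUP_CRASHES = 3;

// Falls back to the last known good version after three startup crashes.
function versionToLaunch(
  currentVersion: string,
  previousVersion: string,
  consecutiveStartupCrashes: number
): string {
  return consecutiveStartupCrashes >= MAX_STARTUP_CRASHES
    ? previousVersion
    : currentVersion;
}
```

The crash counter has to be persisted on disk and reset after a successful start, otherwise the player cannot tell a third crash from a first one.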
For patterns on building robust admin dashboards with role-based access and real-time data, check our multi-tenant admin dashboard guide.
Multi-Tenant Architecture and Analytics for SaaS
If you are building digital signage as a SaaS product (and you should, because the recurring revenue model is far superior to one-time license sales), you need multi-tenancy from day one. Retrofitting tenant isolation into a single-tenant architecture is one of the most painful refactors in software engineering. Do not make that mistake.
Tenant isolation model. Use a shared database with tenant_id columns on every table. This is the most cost-effective approach for signage SaaS, where most tenants are small to mid-size (10 to 500 devices). Add a tenant_id foreign key to: media, playlists, devices, users, schedules, analytics events, and every other domain table. Enforce tenant isolation at the application layer with middleware that injects the tenant_id into every database query. Use PostgreSQL Row Level Security (RLS) as a safety net. Even if your application code has a bug that forgets the tenant filter, RLS prevents cross-tenant data leakage at the database level. For your largest enterprise customers (5,000+ devices), offer a dedicated database schema or dedicated database instance as a premium tier.
MQTT topic isolation. Tenant isolation extends to your MQTT broker. Structure topics as tenants/{tenant_id}/devices/{device_id}/heartbeat and tenants/{tenant_id}/devices/{device_id}/commands. Configure MQTT ACLs so that devices can only subscribe to and publish on topics within their tenant's namespace. EMQX and HiveMQ both support dynamic ACLs backed by a database or HTTP webhook. This prevents a compromised device in one tenant's network from reading messages intended for another tenant's devices.
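The decision an ACL webhook has to make can be sketched as a pure function. The policy shown is one reasonable reading of the rule above, tightened to per-device scope: a device may subscribe only to its own commands topic and publish only under its own device prefix. The request and response format depends on the broker and is not shown, and the function name is hypothetical.

```typescript
type MqttAction = "publish" | "subscribe";

// Assumes tenant and device IDs never contain "/", "+", or "#".
function deviceMayAccess(
  tenantId: string,
  deviceId: string,
  action: MqttAction,
  topic: string
): boolean {
  // Devices never need wildcards, so reject them outright.
  if (topic.includes("#") || topic.includes("+")) return false;

  const base = `tenants/${tenantId}/devices/${deviceId}`;
  const commandsTopic = `${base}/commands`;

  if (action === "subscribe") {
    return topic === commandsTopic;
  }
  // Publishing is limited to the device's own subtree, and a device
  // may not inject messages into its own commands topic.
  return topic.startsWith(`${base}/`) && topic !== commandsTopic;
}
```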
Proof of play analytics. Proof of play is the backbone of signage monetization, especially for advertising-supported networks. Every time a piece of content plays on a screen, the player logs an event: content_id, device_id, started_at, ended_at, and completion_percentage. These events queue locally on the device (in case of connectivity loss) and sync to the cloud in batches. Store proof-of-play data in a time-series optimized structure. For PostgreSQL, use partitioned tables by month. At scale (millions of play events per day), consider ClickHouse or TimescaleDB for the analytics store while keeping PostgreSQL for transactional data.
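The event shape and the batch sync can be sketched as below. The field names mirror the list above in camelCase; the batch helper and the play_events table name used for monthly partitions are assumptions.

```typescript
interface PlayEvent {
  contentId: string;
  deviceId: string;
  startedAt: string; // ISO 8601 timestamp in UTC
  endedAt: string;
  completionPercentage: number; // 0 to 100
}

// Percentage of the asset that actually played, clamped to 0..100.
function completionPercentage(playedSeconds: number, durationSeconds: number): number {
  if (durationSeconds <= 0) return 0;
  const percent = (playedSeconds / durationSeconds) * 100;
  return Math.max(0, Math.min(100, Math.round(percent)));
}

// Splits the locally queued events into upload batches.
function toBatches<T>(queue: T[], batchSize: number): T[][] {
  if (batchSize <= 0) throw new Error("batchSize must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < queue.length; i += batchSize) {
    batches.push(queue.slice(i, i + batchSize));
  }
  return batches;
}

// Name of the monthly partition an event belongs to, derived from the
// year and month at the start of the ISO timestamp.
function monthPartition(startedAt: string): string {
  return `play_events_${startedAt.slice(0, 4)}_${startedAt.slice(5, 7)}`;
}
```

The player should delete a batch from its local queue only after the API confirms receipt, so a dropped connection mid-upload cannot lose events.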
Audience measurement. Advanced signage platforms integrate cameras with anonymous audience analytics. A camera mounted on the display captures frames, and an on-device ML model (running on the edge, never sending raw video to the cloud) estimates viewer count, dwell time, approximate age range, and attention (are they looking at the screen or walking past). Intel OpenVINO or Google MediaPipe provide pre-trained models that run on low-power hardware. This data, aggregated and anonymized, lets advertisers measure campaign effectiveness. Be extremely transparent about privacy: process everything on-device, store only aggregate statistics, and comply with GDPR and local privacy regulations. Never store or transmit raw facial data.
Content performance analytics. Beyond proof of play, give operators insights into what content works. Track metrics like: average completion rate per media asset (do viewers watch the full 30-second video or does engagement drop at 10 seconds?), content change impact (did the new promotion increase dwell time?), and time-of-day performance patterns. Present these in a clean analytics dashboard with charts and exportable CSV reports. This data is what convinces customers to renew their subscriptions year after year, because it proves the signage is delivering measurable business value, not just filling wall space.
Tech Stack Recommendations and Getting Started
After building signage platforms for clients across retail, hospitality, and corporate environments, here is the stack I recommend for a team starting today. These choices optimize for developer productivity, operational reliability, and the ability to scale from 50 devices to 50,000 without re-architecting.
Admin CMS frontend: Next.js with TypeScript. Next.js gives you server-side rendering for fast initial loads, API routes for lightweight backend logic, and a massive ecosystem of UI components. Use Tailwind CSS for styling and a component library like shadcn/ui. The drag-and-drop playlist builder should use dnd-kit or @hello-pangea/dnd, the maintained fork of the now-deprecated react-beautiful-dnd. Handle media uploads with tus-js-client, which gives you resumable uploads and progress callbacks for the progress indicator. TypeScript is non-negotiable for a codebase this complex. You will have shared types between the CMS, the API, and the player, and type safety across all three prevents entire categories of bugs.
Management API: Node.js with Express or Fastify. Node.js is the natural choice because your team is already writing TypeScript for the frontend and the player. Fastify edges out Express on performance (about 2x throughput in some benchmarks, though results vary with workload) and has a better plugin system. Use Prisma as your ORM for type-safe database queries. Structure your API with a clean service layer: controllers handle HTTP concerns, services contain business logic, repositories handle data access. This separation makes testing straightforward and keeps your codebase maintainable as it grows past 50,000 lines.
Database: PostgreSQL with Redis. PostgreSQL handles all transactional data: tenants, users, devices, playlists, schedules, and media metadata. Use JSONB columns for flexible schema fields like device metadata and layout definitions. Redis serves three roles: BullMQ job queue for media transcoding and scheduled tasks, caching layer for frequently accessed data (active playlist manifests, device status), and pub/sub bridge between your API servers for WebSocket message distribution. A single PostgreSQL instance handles up to about 5,000 devices. Beyond that, read replicas and connection pooling with PgBouncer extend your runway to 50,000+ devices before you need to consider sharding.
Media delivery: S3 and CloudFront. Store all media assets in S3. Serve them through CloudFront with aggressive caching headers (media files are immutable, because you use content-addressable filenames). Set Cache-Control to max-age=31536000. CloudFront edge locations mean a device in Tokyo downloads its media from a nearby edge server rather than crossing the Pacific to your US-East S3 bucket. Cost at scale: storing 1TB of media costs about $23/month on S3 (at roughly $0.023 per GB). A 500MB content update downloaded by 10,000 devices is about 5TB of CloudFront transfer, which runs roughly $425 at US list pricing of about $0.085 per GB. Content-addressable caching keeps this down, because devices download only the files that changed. These numbers are still small compared to the value delivered.
MQTT broker: EMQX. EMQX is open-source, handles millions of concurrent connections, supports MQTT v5 with all the features you need (shared subscriptions, topic aliases, message expiry), and has a built-in dashboard for monitoring. Run it as a 3-node cluster for high availability. For managed hosting, EMQX Cloud starts at $99/month and saves you operational overhead. HiveMQ is the main alternative, with stronger enterprise support but higher pricing.
Device player: Electron for Linux/Windows, native Android app for ARM devices. Share as much code as possible between platforms. The core player logic (schedule evaluation, cache management, analytics logging, MQTT communication) lives in a shared TypeScript library. The Electron wrapper and Android wrapper are thin shells that provide platform-specific APIs (screen capture, display control, kiosk mode) and render content using the Chromium engine on both platforms.
If you are planning a digital signage platform and want to understand budget expectations before diving into development, read our cost breakdown for building a digital signage platform. When you are ready to move from planning to building, book a free strategy call and we will map out your architecture, timeline, and MVP scope together.