Designing a scalable real-time conversational platform

Case Study · Real-time · WebSocket · Scaling

I led engineering on a real-time messaging product that required low-latency updates, multi-tenant isolation, and predictable scaling. The project combined event-driven architecture with pragmatic operational controls to deliver a reliable product under load.

Problem

The existing system used polling and suffered from high database load and inconsistent message delivery at scale. Customers experienced delays and out-of-order events during busy periods.

Solution

We introduced WebSockets backed by a lightweight broker, with a message queue for durability. Each tenant received its own scoped channels and rate limits. We moved read-heavy workflows onto eventually consistent reads with cache priming, and pushed heavy processing tasks onto worker queues.
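To make the tenant scoping and rate limiting concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the production code: the `tenant:<id>:<topic>` naming scheme, the token-bucket algorithm, and every name in it (`tenant_channel`, `TokenBucket`, `TenantRateLimiter`) are hypothetical choices made for this example.

```python
import time
from typing import Callable, Dict


def tenant_channel(tenant_id: str, topic: str) -> str:
    """Build a channel name namespaced by tenant (hypothetical naming scheme)."""
    if ":" in tenant_id:
        # Reject the separator so one tenant cannot spoof another's namespace.
        raise ValueError("tenant_id must not contain ':'")
    return f"tenant:{tenant_id}:{topic}"


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start full so short bursts are allowed
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantRateLimiter:
    """Keeps one independent bucket per tenant (assumed isolation model)."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str) -> bool:
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self._buckets[tenant_id] = bucket
        return bucket.allow()
```

Because each tenant has its own bucket, a noisy tenant exhausts only its own budget. The injectable `clock` exists so the limiter can be tested without sleeping.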

Work done

The work included schema changes to support sharding, Redis channels for pub/sub, horizontally scaled WebSocket nodes, and graceful reconnect logic to handle transient failures. We added monitoring and synthetic tests to detect message lag and backpressure.
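One common way to build graceful reconnect logic is exponential backoff with jitter, so that clients dropped at the same moment do not all reconnect at once. The sketch below shows that technique in Python; it is an assumption about the general approach rather than a record of what shipped, and `backoff_delays`, `connect_with_retry`, and the default values are hypothetical.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 0.5, cap: float = 30.0, max_attempts: int = 8,
                   rng: Callable[[], float] = random.random) -> Iterator[float]:
    """Yield jittered delays, each uniform in [0, min(cap, base * 2**attempt))."""
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def connect_with_retry(connect: Callable[[], T],
                       sleep: Callable[[float], None] = time.sleep,
                       delays: Optional[Iterable[float]] = None) -> T:
    """Call `connect` until it succeeds or the delay schedule is exhausted."""
    last_error: Optional[BaseException] = None
    try:
        return connect()              # first attempt happens immediately
    except ConnectionError as exc:
        last_error = exc
    for delay in (delays if delays is not None else backoff_delays()):
        sleep(delay)                  # wait before each retry
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError("reconnect attempts exhausted") from last_error
```

In a real client, `connect` would open the socket and resubscribe to the tenant's channels; here it is any callable that raises `ConnectionError` on failure. Capping the delay keeps the worst-case wait bounded during a long outage.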

Impact
