Most Developers Don’t Understand Failure — And It Shows

October 3, 2024 | 8 min read | Next.js, Socket.io, Real-time, Collaboration, MongoDB

By Ancel Ajanga, Software Engineer at Maxson Programming Limited. Published in the Developer Journal on https://ancel.co.ke.

This article explores the challenges and solutions in building a real-time collaborative project management platform, from WebSocket connections to optimistic UI updates and conflict resolution.

Who this is for

Teams and product leads who need real-time collaboration (editing, boards, sync) without conflicts, and engineers evaluating Next.js + Socket.io for production.

Problem

Distributed teams face stale data, edit conflicts, and fragmented communication. Off-the-shelf tools often struggle with scalability and real-time consistency.

Business outcome

A single platform that keeps everyone in sync with sub-second latency, supports 100+ concurrent users per project, and reduces coordination overhead.

Metrics

  • Sub-500ms real-time sync
  • 100+ concurrent users per project
  • 99.9% uptime with graceful reconnection

Hook

Most software is built on a house of cards. Here is how I learned that the hard way.

Problem

When you leave the safe zone of tutorial applications, concurrency and memory constraints hit hard.

Struggle

I battled bizarre edge cases for weeks. My initial assumptions were wrong, and the framework defaults only made it worse.

Solution

By abandoning the 'best practices' and adopting a pragmatic, data-driven architecture, I finally broke through the bottleneck.

Insight

Building real systems teaches you that elegant code is secondary to robust architecture.

Trade-offs

I made three deliberate trade-offs:

  • Last-write-wins over OT/CRDTs for conflict resolution, to ship faster. For higher concurrency I would add version vectors, which detect conflicting edits instead of silently overwriting them.
  • MongoDB over PostgreSQL for flexible schema evolution during rapid iteration. I traded full ACID guarantees for write throughput.
  • Socket.io over raw WebSockets for built-in reconnection and room management.
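To make the last-write-wins choice concrete, here is a minimal sketch of how such a merge could look. The `TaskDoc` shape, its field names, and the `mergeLww` helper are hypothetical illustrations, not the platform's actual code, and the sketch assumes the server assigns `updatedAt` so that client clock skew does not decide the winner.

```typescript
// Hypothetical document shape; field names are illustrative assumptions.
interface TaskDoc {
  id: string;
  title: string;
  updatedAt: number; // assumed server-assigned timestamp, in milliseconds
  updatedBy: string; // client id, used only to break timestamp ties
}

// Last-write-wins: keep whichever version carries the later timestamp.
// Equal timestamps are resolved by comparing client ids, so every replica
// picks the same winner regardless of the order in which it sees the edits.
function mergeLww(current: TaskDoc, incoming: TaskDoc): TaskDoc {
  if (incoming.updatedAt > current.updatedAt) return incoming;
  if (incoming.updatedAt < current.updatedAt) return current;
  return incoming.updatedBy > current.updatedBy ? incoming : current;
}

const stored: TaskDoc = { id: "t1", title: "Draft spec", updatedAt: 1000, updatedBy: "alice" };
const newer: TaskDoc = { id: "t1", title: "Final spec", updatedAt: 1500, updatedBy: "bob" };
const stale: TaskDoc = { id: "t1", title: "Old spec", updatedAt: 900, updatedBy: "carol" };

console.log(mergeLww(stored, newer).title); // "Final spec"
console.log(mergeLww(stored, stale).title); // "Draft spec"
```

The cost of this simplicity is visible in the second call: the stale edit is dropped without any signal to its author. That is the gap version vectors would close, at the price of more bookkeeping per document.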

Challenges faced

  • WebSocket reconnection under flaky networks required exponential backoff and client state reconciliation after each reconnect.
  • The real-time channel had to stay independent of REST failures, so that one broken API call did not block live updates.
  • Scaling to 100+ users per project required MongoDB indexing and small event payloads.
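The reconnection challenge can be sketched in two small pieces: a backoff schedule and a catch-up step after the socket comes back. Both are illustrative assumptions rather than the platform's real implementation. The Socket.io client already exposes backoff tuning through its `reconnectionDelay`, `reconnectionDelayMax`, and `randomizationFactor` options, so the `backoffDelayMs` function below only shows the arithmetic behind that kind of schedule. The `SyncEvent` type and its sequence-number scheme are hypothetical.

```typescript
// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt,
// scaled by a random factor in [1 - jitter, 1 + jitter], then capped at maxMs.
// Jitter spreads clients out so they do not all reconnect at the same instant.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  maxMs = 30000,
  jitter = 0.5,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const exponential = baseMs * Math.pow(2, attempt);
  const factor = 1 + jitter * (random() * 2 - 1);
  return Math.min(maxMs, Math.round(exponential * factor));
}

// Hypothetical event shape: the server is assumed to stamp every change
// with a monotonically increasing sequence number.
interface SyncEvent {
  seq: number;
  taskId: string;
  title: string;
}

// After reconnecting, apply only the events the client has not seen yet,
// in sequence order, and return the new high-water mark.
function reconcile(state: Map<string, string>, lastSeq: number, missed: SyncEvent[]): number {
  const pending = missed
    .filter((e) => e.seq > lastSeq)
    .sort((a, b) => a.seq - b.seq);
  for (const e of pending) {
    state.set(e.taskId, e.title);
    lastSeq = e.seq;
  }
  return lastSeq;
}
```

Keeping the event payload this small (an id, a sequence number, and the changed field) is also in line with the small-payload point above: a client that was offline briefly replays a short list of deltas instead of refetching whole documents.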