Distributed Systems

Definition

A distributed system is a collection of independent computers that appears to its users as a single coherent system. Its components run on different networked nodes and communicate by message passing to achieve a common goal.

Why It Matters

We no longer live in a world where a single computer can hold the sum of human data. Distributed systems are the invisible infrastructure of the modern world; when they handle partial outages or network latency poorly, the financial markets, communication networks, and cloud services built on them can suffer widespread outages. Understanding these systems is the difference between building a service that crumbles under its first million users and building one that stays resilient and scalable across regions.

Core Concepts

Concurrency: Components execute simultaneously and, because they share no memory, must exchange messages to synchronize state.
Partial Failure: One node may fail while others continue to operate. The system must be designed for fault tolerance and graceful degradation.
No Global Clock: There is no single, global notion of the “correct time”; physical clocks on different nodes drift apart, so events are ordered with logical clocks (e.g., Lamport timestamps) or through consensus algorithms.
CAP Theorem: A distributed data store cannot simultaneously provide more than two of three guarantees: Consistency, Availability, and Partition Tolerance. Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts.
Consensus Algorithms: Mechanisms (like Paxos or Raft) used to achieve agreement on a single data value among distributed processes.
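
To make the "No Global Clock" idea above concrete, here is a minimal sketch of Lamport timestamps in Python. The class name LamportClock and its methods (tick, send, receive) are hypothetical names chosen for illustration and do not come from any library. The sketch assumes each process keeps one integer counter and attaches it to every message it sends.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock for a single process (illustrative helper, not a library API)."""
    time: int = 0

    def tick(self) -> int:
        """Advance the clock by one for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending counts as an event; return the timestamp to attach to the message."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Merge a received timestamp: take the larger value, then advance by one."""
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two processes, A and B, with independent counters.
a, b = LamportClock(), LamportClock()
a.tick()                        # local event on A: A's clock becomes 1
stamp = a.send()                # A sends a message: A's clock becomes 2, message carries 2
b.tick()                        # unrelated local event on B: B's clock becomes 1
received_at = b.receive(stamp)  # B's clock becomes max(1, 2) + 1 = 3
print(stamp, received_at)       # prints "2 3"
```

The receive timestamp is always larger than the send timestamp, so causally related events are ordered consistently without any shared physical clock. Timestamps of unrelated events on different nodes can still tie or fall in either order; a common tiebreak, assumed here rather than taken from the note, is to compare node identifiers.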

Connected Concepts