Input Data Distributions

Definition

Input Data Distributions are probability distributions, usually theoretical and occasionally empirical, used to represent the stochastic behavior of system variables (e.g., interarrival times, service times, times to failure) in a simulation model.

Why It Matters

The real world doesn’t operate on averages. A system (like a hospital or a website) sized for the average number of visitors will be overwhelmed during the inevitable peaks in demand. Probability distributions let us model that variability, so a design can be tested against peak and rare conditions, not just typical ones.

Core Concepts

Common Continuous Distributions:
- Exponential: Models interarrival times for random (Poisson) processes. Defined by a single parameter: the rate $\lambda$, or equivalently the mean $1/\lambda$.
  - How to read: “The parameter lambda.”
  - Meaning / when to use: Rate parameter of the exponential distribution (the mean is $1/\lambda$); use when events arrive randomly at a constant average rate, in which case interarrival times are memoryless.
- Uniform: Used as a “first cut” when only Min and Max are known. Each value is equally likely.
- Triangular: Used when Min, Mode (most common value), and Max are known. A rough stand-in for the Normal when data is limited.
- Normal: Symmetric; models the sum of many independent subprocesses. Requires care to avoid negative values in simulation.
- Weibull: Versatile; often used for service times or times to failure, which cannot be negative.
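
The continuous distributions listed above can all be sampled with Python's standard `random` module. The following is a minimal sketch, not a definitive implementation: the helper names and every parameter value (a mean interarrival time of 5, a service time between 2 and 9, and so on) are assumptions made for illustration. The resampling loop for the Normal is one possible way to avoid negative values; note that it slightly raises the mean of what is actually sampled.

```python
import random

def sample_exponential(rng, mean):
    # random.expovariate expects the rate lambda, which is 1 / mean.
    return rng.expovariate(1.0 / mean)

def sample_uniform(rng, low, high):
    # Every value between low and high is equally likely.
    return rng.uniform(low, high)

def sample_triangular(rng, low, mode, high):
    # Argument order of random.triangular is (low, high, mode).
    return rng.triangular(low, high, mode)

def sample_nonnegative_normal(rng, mu, sigma):
    # Assumption: negative draws are discarded and redrawn.
    # This truncates the distribution at 0 and shifts its mean upward a little.
    while True:
        x = rng.gauss(mu, sigma)
        if x >= 0:
            return x

def sample_weibull(rng, scale, shape):
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    return rng.weibullvariate(scale, shape)

rng = random.Random(42)  # fixed seed so runs are repeatable
n = 10_000

# Illustrative parameters only; none of these values come from real data.
interarrival = [sample_exponential(rng, 5.0) for _ in range(n)]
first_cut = [sample_uniform(rng, 2.0, 9.0) for _ in range(n)]
service = [sample_triangular(rng, 2.0, 4.0, 9.0) for _ in range(n)]
process = [sample_nonnegative_normal(rng, 3.0, 2.0) for _ in range(n)]
lifetime = [sample_weibull(rng, 10.0, 1.5) for _ in range(n)]

print("mean interarrival time:", sum(interarrival) / n)
print("mean triangular service time:", sum(service) / n)
```

With enough samples, the printed interarrival mean should land close to the chosen mean of 5, and the triangular mean close to (Min + Mode + Max) / 3.
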
Common Discrete Distributions:
- Poisson: Models the number of events in a fixed interval of time.
- Bernoulli: Models two outcomes (e.g., Pass/Fail, Success/Failure).
- Geometric: Models the number of failures before the first success (e.g., batch sizes).
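
The discrete distributions above can be sketched in the same way. This example is illustrative only: the function names, the success probability of 0.25, and the rate of 4 events per interval are assumptions. The Poisson sampler counts how many exponential interarrival times fit in one interval, which shows the link between the Poisson and Exponential distributions; it is a simple approach for modest rates, not an optimized one.

```python
import random

def sample_bernoulli(rng, p):
    # 1 = success with probability p, 0 = failure otherwise.
    return 1 if rng.random() < p else 0

def sample_geometric(rng, p):
    # Number of failures before the first success (requires 0 < p <= 1).
    failures = 0
    while rng.random() >= p:
        failures += 1
    return failures

def sample_poisson(rng, rate, interval=1.0):
    # Count the events whose cumulative exponential interarrival
    # times fall inside the interval.
    count = 0
    elapsed = rng.expovariate(rate)
    while elapsed <= interval:
        count += 1
        elapsed += rng.expovariate(rate)
    return count

rng = random.Random(2024)  # fixed seed so runs are repeatable
n = 20_000

# Illustrative parameters only.
passes = [sample_bernoulli(rng, 0.25) for _ in range(n)]
failures_before_success = [sample_geometric(rng, 0.25) for _ in range(n)]
events_per_interval = [sample_poisson(rng, 4.0) for _ in range(n)]

print("pass fraction:", sum(passes) / n)
print("mean failures before first success:", sum(failures_before_success) / n)
print("mean events per interval:", sum(events_per_interval) / n)
```

For these parameters the theoretical means are 0.25, (1 - 0.25) / 0.25 = 3, and 4 respectively, so the printed sample means should be close to those values.
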
Empirical Distributions: Used when data cannot be fitted to a theoretical distribution. The model pulls directly from the observed data frequencies.
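
As a rough sketch of how a model might pull directly from observed frequencies, the snippet below builds a frequency table and samples from it in proportion to those frequencies. The `observed` values are made up for illustration, and `build_empirical` and `sample_empirical` are hypothetical helper names, not part of any simulation package.

```python
import random
from collections import Counter

# Hypothetical observed service times (e.g., in minutes); not real data.
observed = [2, 3, 3, 4, 4, 4, 5, 7]

def build_empirical(data):
    # Tabulate each distinct value and how often it was observed.
    counts = Counter(data)
    values = sorted(counts)
    weights = [counts[v] for v in values]
    return values, weights

def sample_empirical(rng, values, weights, k=1):
    # Draw k values with probability proportional to observed frequency.
    return rng.choices(values, weights=weights, k=k)

rng = random.Random(7)  # fixed seed so runs are repeatable
values, weights = build_empirical(observed)
draws = sample_empirical(rng, values, weights, k=1000)

print("values:", values)
print("weights:", weights)
print("first draws:", draws[:10])
```

A sampler like this can only ever return values that were actually observed, which is the main trade-off against fitting a theoretical distribution.
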
