Service Time Distributions in SimPy: Choosing the Right Randomness
Service times (how long things take) are rarely fixed. The distribution you choose for them can make the difference between a realistic simulation and a useless one.
Why Distribution Matters
Fixed service time: Everyone served in exactly 5 minutes. Reality: Some quick, some slow, most somewhere in between.
The distribution you choose shapes:
- Queue behaviour
- Waiting times
- System capacity
- Your conclusions
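To make the effect concrete, here is a minimal sketch comparing the mean time in queue for fixed and exponential service times that share a mean of 5. It uses the Lindley recursion for a single-server FIFO queue in plain Python rather than SimPy, to stay dependency-free. The function name `mean_wait`, the mean inter-arrival time of 6, and the seed are illustrative assumptions.

```python
import random


def mean_wait(service_sampler, n_customers=100_000, mean_interarrival=6.0, seed=1):
    """Estimate the mean time in queue for a single-server FIFO queue.

    Uses the Lindley recursion: the next customer's wait is the current
    customer's wait plus their service time, minus the gap until the next
    arrival, floored at zero.
    """
    rng = random.Random(seed)
    wait = 0.0   # the first customer arrives to an empty system
    total = 0.0
    for _ in range(n_customers):
        total += wait
        service = service_sampler(rng)
        gap = rng.expovariate(1 / mean_interarrival)  # Poisson arrivals
        wait = max(0.0, wait + service - gap)
    return total / n_customers


# Same mean service time (5), different variability.
fixed_wait = mean_wait(lambda rng: 5.0)
expo_wait = mean_wait(lambda rng: rng.expovariate(1 / 5))

print(f"Fixed service:       mean wait = {fixed_wait:.1f}")
print(f"Exponential service: mean wait = {expo_wait:.1f}")
```

At this utilisation (5/6), queueing theory predicts a mean wait of about 12.5 for fixed service and about 25 for exponential service, so the estimates should differ by roughly a factor of two even though the mean service time is identical.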
The Common Distributions
Exponential (Memoryless)
import random
service_time = random.expovariate(1 / 5)  # argument is the rate (1/mean), so mean = 5
Properties:
- Density peaks at 0, so very short times are the most common
- Long tail (occasional very long times)
- "Memoryless": time already spent doesn't predict time remaining
Use when:
- Process can be interrupted and resumed
- Duration is inherently unpredictable
- You want to match classic M/M/1 queueing theory
Avoid when:
- There's a minimum time required
- Process is structured (steps that must be completed)
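The memoryless property can be checked empirically. The sketch below estimates the chance that a service lasts more than `t` longer, once for fresh services and once for services that have already run for `s`. The values of `s`, `t`, the sample size, and the seed are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)
samples = [rng.expovariate(1 / 5) for _ in range(200_000)]

s, t = 3.0, 4.0  # already served for s; asking about t more

# P(T > t): chance that a fresh service lasts longer than t
p_fresh = sum(x > t for x in samples) / len(samples)

# P(T > s + t | T > s): same question for services that already lasted s
survivors = [x for x in samples if x > s]
p_conditional = sum(x > s + t for x in survivors) / len(survivors)

print(f"P(T > {t})                  = {p_fresh:.3f}")
print(f"P(T > {s + t} | T > {s})    = {p_conditional:.3f}")
```

The two estimates should agree to within sampling noise. If that sounds wrong for your process (for example, a task that is nearly finished after 3 minutes), the exponential is probably a poor fit.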
Uniform
service_time = random.uniform(3, 8) # Between 3 and 8
Properties:
- All values equally likely
- Hard bounds (never below min, never above max)
- No "typical" value
Use when:
- You know the range but not the shape
- Quick-and-dirty modelling
- Bounded variation makes sense
Avoid when:
- You have real data (it's rarely uniform)
- Some values should be more likely than others
Normal (Gaussian)
service_time = max(0, random.gauss(5, 1)) # Mean 5, std 1
Properties:
- Bell curve
- Most values near the mean
- Symmetric tails
Use when:
- Natural variation around a central value
- Process with many small random factors
Caution:
- Can produce negative values (use max(0, ...))
- Unbounded tails may not match reality
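Clipping with `max(0, ...)` is harmless when negatives are rare, but it distorts the distribution when the standard deviation is large relative to the mean. The sketch below compares clipping with redrawing until the value is positive. The helper names `clipped_gauss` and `resampled_gauss` are hypothetical, and the standard deviation of 4 is chosen only to make the effect visible.

```python
import random

rng = random.Random(7)


def clipped_gauss(mean, std):
    """Clip negatives to zero: simple, but piles probability mass at exactly 0."""
    return max(0.0, rng.gauss(mean, std))


def resampled_gauss(mean, std):
    """Redraw until positive: a truncated normal with no spike at 0."""
    while True:
        value = rng.gauss(mean, std)
        if value > 0:
            return value


n = 100_000
results = {}  # std -> (mean of clipped samples, share of samples equal to 0)
for std in (1, 4):
    clipped = [clipped_gauss(5, std) for _ in range(n)]
    share_at_zero = sum(x == 0.0 for x in clipped) / n
    results[std] = (sum(clipped) / n, share_at_zero)
    print(f"std={std}: clipped mean={results[std][0]:.2f}, share at zero={share_at_zero:.3f}")
```

With a standard deviation of 1 the clip almost never triggers. With a standard deviation of 4, roughly a tenth of the samples land on exactly 0 and the mean drifts above 5. Redrawing removes the spike at 0 but also raises the mean, so in that regime a distribution that is positive by construction (lognormal or gamma) is usually the cleaner choice.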
Triangular
service_time = random.triangular(2, 10, 5) # min, max, mode
Properties:
- Bounded (between min and max)
- Peak at the mode
- Can be asymmetric
Use when:
- You can estimate min, max, and most likely
- Expert judgement available
- Don't have data for a precise fit
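One thing to watch with expert estimates: the "most likely" value is the mode, not the mean. For a triangular distribution the mean is (min + max + mode) / 3, which the sketch below checks against a sample using the same parameters as above. The seed and sample size are arbitrary.

```python
import random

low, high, mode = 2, 10, 5
rng = random.Random(3)
samples = [rng.triangular(low, high, mode) for _ in range(100_000)]

theoretical_mean = (low + high + mode) / 3   # about 5.67, not 5
sample_mean = sum(samples) / len(samples)

print(f"Mode:             {mode}")
print(f"Theoretical mean: {theoretical_mean:.2f}")
print(f"Sample mean:      {sample_mean:.2f}")
```

If the expert's "typical" figure was really an average, the mode needs adjusting so that the mean of the triangle matches it.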
Lognormal
import numpy as np
rng = np.random.default_rng()
service_time = rng.lognormal(mean=1, sigma=0.5)  # mean and sigma of the underlying normal, not of the service time
Properties:
- Always positive
- Right-skewed (long tail of high values)
- Models multiplicative processes
Use when:
- Many factors multiply together
- Occasional very long times
- Human task completion times
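Because the lognormal parameters describe the underlying normal, you usually need to convert from the mean and standard deviation you actually observed. A minimal sketch of that conversion follows. The helper name `lognormal_params` is hypothetical, the target mean of 5 and standard deviation of 2 are example values, and the standard library's `random.lognormvariate` is used here because it takes the same `mu` and `sigma` as NumPy's `lognormal`.

```python
import math
import random


def lognormal_params(mean, std):
    """Convert the desired mean and std of the service time into
    (mu, sigma) of the underlying normal distribution."""
    sigma_sq = math.log(1 + (std / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2
    return mu, math.sqrt(sigma_sq)


mu, sigma = lognormal_params(mean=5, std=2)

rng = random.Random(11)
samples = [rng.lognormvariate(mu, sigma) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)

print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
print(f"Sample mean = {sample_mean:.2f}")  # should be close to 5
```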
Gamma / Erlang
service_time = random.gammavariate(alpha=2, beta=2.5) # shape, scale
Properties:
- Always positive
- Flexible shape (adjust alpha)
- Erlang is gamma with integer alpha
Use when:
- Sum of multiple exponential stages
- Structured processes (multiple steps)
- More control than exponential
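The link between stages and the gamma distribution can be verified directly. The sketch below builds an Erlang sample by adding two exponential stage times and compares it with `gammavariate` using the same shape and scale as above. The variable names and seed are illustrative.

```python
import random
import statistics

rng = random.Random(5)
stages, mean_per_stage = 2, 2.5
n = 100_000

# Erlang built by hand: add up `stages` independent exponential stage times
summed = [
    sum(rng.expovariate(1 / mean_per_stage) for _ in range(stages))
    for _ in range(n)
]

# The same distribution drawn directly as a gamma (shape=stages, scale=mean_per_stage)
direct = [rng.gammavariate(stages, mean_per_stage) for _ in range(n)]

summed_mean, summed_var = statistics.fmean(summed), statistics.pvariance(summed)
direct_mean, direct_var = statistics.fmean(direct), statistics.pvariance(direct)

print(f"Summed stages: mean={summed_mean:.2f}, variance={summed_var:.2f}")
print(f"Gamma direct:  mean={direct_mean:.2f}, variance={direct_var:.2f}")
```

Both should show a mean near 5 (shape times scale) and a variance near 12.5 (shape times scale squared). Adding more stages with the same total mean reduces the variance, which is why Erlang suits structured processes.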
Weibull
service_time = random.weibullvariate(alpha=5, beta=1.5) # scale, shape
Properties:
- Flexible failure-time distribution
- Can model increasing, decreasing, or constant hazard
Use when:
- Reliability modelling
- Time-to-failure
- Lifetime distributions
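The shape parameter controls how the hazard rate changes over time. The sketch below evaluates the Weibull hazard for three shapes and checks the sample mean against the closed form, scale times Γ(1 + 1/shape). The `hazard` helper is a hypothetical name written for this illustration and should only be called with t > 0.

```python
import math
import random

scale, shape = 5, 1.5
rng = random.Random(9)
samples = [rng.weibullvariate(scale, shape) for _ in range(100_000)]

theoretical_mean = scale * math.gamma(1 + 1 / shape)
sample_mean = sum(samples) / len(samples)
print(f"Theoretical mean = {theoretical_mean:.2f}, sample mean = {sample_mean:.2f}")


def hazard(t, scale, shape):
    """Weibull hazard rate: instantaneous completion (or failure) rate at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


for k in (0.5, 1.0, 1.5):
    early, late = hazard(1, scale, k), hazard(4, scale, k)
    print(f"shape={k}: hazard at t=1 is {early:.3f}, at t=4 is {late:.3f}")
```

A shape below 1 gives a falling hazard (early failures), a shape of exactly 1 gives a constant hazard (the exponential distribution), and a shape above 1 gives a rising hazard (wear-out).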
Distribution Comparison
| Distribution | Lower bound | Shape | Best For |
|---|---|---|---|
| Exponential | 0 | Steep decay | Memoryless processes |
| Uniform | Custom | Flat | Unknown shape |
| Normal | -∞ | Bell | Natural variation |
| Triangular | Custom | Triangle | Expert estimates |
| Lognormal | 0 | Right-skewed | Human tasks |
| Gamma | 0 | Flexible | Multi-stage processes |
| Weibull | 0 | Flexible | Reliability and time-to-failure |
Practical Examples
Fast Food Service
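import math

def order_time():
    """Time to take an order: mostly quick, sometimes slow."""
    return random.triangular(0.5, 3, 1)  # min, max, mode

def preparation_time(complexity):
    """Preparation time grows with order complexity."""
    base_time = 2 + complexity * 1.5
    # The random module has no lognormal(); lognormvariate(mu, sigma) makes base_time the median
    return random.lognormvariate(math.log(base_time), 0.3)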
Manufacturing
def machine_process_time():
    """Machine cycle: structured and predictable."""
    return random.gammavariate(alpha=4, beta=1.25)  # mean = alpha * beta = 5

def setup_time():
    """Setup: mostly quick, with occasional issues."""
    return random.lognormvariate(1, 0.5)  # mu and sigma of the underlying normal
Healthcare
def consultation_time(patient_type):
    """Doctor consultation time, with a 5-minute floor."""
    base_times = {'routine': 10, 'complex': 25, 'emergency': 15}
    base = base_times[patient_type]
    return max(5, random.gauss(base, base * 0.2))  # std is 20% of the base time
Fitting to Data
If you have real data:
from scipy import stats

# Fit various distributions
data = [...]  # Your service time observations
distributions = {
    'expon': stats.expon,
    'norm': stats.norm,
    'lognorm': stats.lognorm,
    'gamma': stats.gamma
}
for name, dist in distributions.items():
    params = dist.fit(data)
    # Compare with Kolmogorov-Smirnov test
    ks_stat, p_value = stats.kstest(data, dist.cdf, params)
    print(f"{name}: KS stat = {ks_stat:.4f}, p = {p_value:.4f}")
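Prefer the distribution with the smallest KS statistic, but treat the p-values as optimistic because the parameters were estimated from the same data. Check the fit visually as well, for example with a histogram or Q-Q plot.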
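As a quick sanity check on a fitted gamma, the method of moments gives parameter estimates from nothing more than the sample mean and variance. This is a simpler technique than the maximum-likelihood fit that `dist.fit` performs, so expect the numbers to be close rather than identical. The sketch uses synthetic data as a stand-in for real observations, and the names `shape_hat` and `scale_hat` are illustrative.

```python
import random
import statistics

rng = random.Random(21)

# Stand-in for real observations: synthetic gamma data with shape 4, scale 1.25
data = [rng.gammavariate(4, 1.25) for _ in range(50_000)]

mean = statistics.fmean(data)
variance = statistics.variance(data)

# Method of moments for a gamma: mean = shape * scale, variance = shape * scale**2
scale_hat = variance / mean
shape_hat = mean / scale_hat

print(f"Estimated shape = {shape_hat:.2f}, scale = {scale_hat:.2f}")
```

If the moment estimates and the fitted parameters disagree badly, that is a hint that the gamma family does not describe the data well, or that the fit included a location shift you did not intend.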
Using NumPy vs Random
The random module is fine for simple cases:
import random
random.expovariate(1/5)
random.gauss(10, 2)
NumPy offers more distributions and better control:
import numpy as np
rng = np.random.default_rng(seed=42)
rng.exponential(5)
rng.normal(10, 2)
rng.gamma(2, 2.5)
rng.lognormal(1, 0.5)
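The standard library offers similar control through dedicated `random.Random` instances. The sketch below, with illustrative stream names and seeds, keeps arrivals and service times on separate generators so that changing one part of the model does not shift the random numbers used by another.

```python
import random

# One generator per source of randomness, each with its own seed
arrival_rng = random.Random(1)
service_rng = random.Random(2)

first_run = [service_rng.expovariate(1 / 5) for _ in range(3)]

# Drawing arrivals does not disturb the service stream
_ = [arrival_rng.expovariate(1 / 6) for _ in range(100)]

# Re-creating the generator with the same seed replays the same service times
service_rng = random.Random(2)
second_run = [service_rng.expovariate(1 / 5) for _ in range(3)]

print(first_run == second_run)
```

Separate streams make it easier to compare scenarios fairly, because each scenario sees the same sequence of service times.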
Deterministic Service
Sometimes variability isn't the point:
def service(env, resource):
    with resource.request() as req:
        yield req
        yield env.timeout(5)  # Always 5
Use fixed times when:
- Variability isn't your research question
- Process is truly deterministic
- Simplifying for initial testing
Summary
Choosing a distribution:
1. Know your data: fit a distribution if you have it
2. Know your process: memoryless? Structured? Bounded?
3. Know your purpose: does the shape matter for your question?
The wrong distribution gives wrong answers. The right one reveals truth.