Random Numbers in SimPy: Seeds, Streams, and Reproducibility
Randomness is the soul of simulation. But uncontrolled randomness is chaos. Here's how to do it right.
The Reproducibility Problem
Run a simulation twice. Get different results. Which one's right?
Neither. Both. That's randomness.
Reproducibility means getting the same random results when you need them. Different results when comparing scenarios.
Basic Seeding
import random
random.seed(42) # Fixed seed
result1 = random.expovariate(1/5)
random.seed(42) # Same seed
result2 = random.expovariate(1/5)
print(result1 == result2) # True
Same seed = same sequence of random numbers.
Python's random Module
import random
random.seed(42)
# Common distributions
random.random() # Uniform [0, 1)
random.uniform(5, 10) # Uniform [5, 10]
random.expovariate(1/5) # Exponential, mean 5
random.gauss(10, 2) # Normal, mean 10, std 2
random.triangular(1, 10, 5) # Triangular
random.choice([1, 2, 3]) # Random selection
random.randint(1, 10) # Integer in range
NumPy's Random Generator (Recommended)
More distributions. Better control. Modern API:
import numpy as np
rng = np.random.default_rng(seed=42)
rng.random() # Uniform [0, 1)
rng.uniform(5, 10) # Uniform [5, 10]
rng.exponential(5) # Exponential, mean 5
rng.normal(10, 2) # Normal, mean 10, std 2
rng.gamma(2, 2.5) # Gamma
rng.lognormal(1, 0.5) # Lognormal
rng.poisson(5) # Poisson
rng.choice([1, 2, 3]) # Random selection
rng.integers(1, 10, endpoint=True) # Integer in range
Per-Process Random Streams
Best practice: each process type gets its own random stream:
class Simulation:
def __init__(self, seed):
base_rng = np.random.default_rng(seed)
# Generate seeds for sub-streams
seeds = base_rng.integers(0, 2**31, size=5)
self.arrival_rng = np.random.default_rng(seeds[0])
self.service_rng = np.random.default_rng(seeds[1])
self.breakdown_rng = np.random.default_rng(seeds[2])
def arrival_time(self):
return self.arrival_rng.exponential(5)
def service_time(self):
return self.service_rng.exponential(3)
Why separate streams?
- Isolation - Changing arrivals doesn't affect service randomness
- Comparison - Same arrivals, different service configurations
- Debugging - Reproducible sub-components
Common Random Numbers (CRN)
Compare scenarios fairly by using the same random numbers:
def run_scenario(env, num_servers, rng_seed):
rng = np.random.default_rng(rng_seed)
server = simpy.Resource(env, capacity=num_servers)
# ... run simulation
# Same random arrivals for both scenarios
results_2_servers = run_scenario(simpy.Environment(), 2, seed=42)
results_3_servers = run_scenario(simpy.Environment(), 3, seed=42)
Now the difference in results is due to server count, not random variation.
Independent Replications
For confidence intervals, use different seeds:
def run_replication(seed):
random.seed(seed)
env = simpy.Environment()
# ... run simulation
return mean_wait_time
# Multiple replications with different seeds
base_seed = 42
results = []
for i in range(30):
results.append(run_replication(base_seed + i))
mean = np.mean(results)
ci = 1.96 * np.std(results, ddof=1) / np.sqrt(len(results))
print(f"Mean: {mean:.2f} ± {ci:.2f}")
Global vs Local State
Global state (not recommended for complex simulations):
import random
random.seed(42) # Affects all random calls globally
Local state (recommended):
import numpy as np
rng = np.random.default_rng(42) # Only affects this generator
Reproducibility Checklist
- Always set a seed at the start
- Log your seed so you can reproduce bugs
- Use separate streams for different random processes
- Document your seeds in results
def run_simulation(config_seed=None):
if config_seed is None:
config_seed = int(time.time()) # Random if not specified
print(f"Running with seed: {config_seed}")
rng = np.random.default_rng(config_seed)
# ...
return results, config_seed
Anti-Patterns
Don't reseed mid-simulation
# Bad!
def customer(env):
random.seed(42) # Resets sequence!
wait = random.expovariate(1/5)
Don't use time-based seeds for comparisons
# Bad for comparison!
random.seed(int(time.time()))
Don't mix random modules carelessly
# Confusing - which seed controls what?
import random
import numpy as np
random.seed(42)
np.random.seed(42) # Different generators!
Testing Randomness
Sanity check your distributions:
rng = np.random.default_rng(42)
# Generate many samples
samples = [rng.exponential(5) for _ in range(10000)]
# Check mean (should be ~5)
print(f"Sample mean: {np.mean(samples):.2f}")
# Visual check
import matplotlib.pyplot as plt
plt.hist(samples, bins=50)
plt.show()
Advanced: Variance Reduction
Techniques to reduce variance without more replications:
Antithetic variates:
def generate_antithetic_pair(rng, mean):
u = rng.random()
return -mean * np.log(u), -mean * np.log(1 - u)
Common random numbers (already covered) - use same stream across scenarios.
Complete Example
import simpy
import numpy as np
class SimulationRNG:
"""Manages all random streams for a simulation."""
def __init__(self, master_seed):
master = np.random.default_rng(master_seed)
seeds = master.integers(0, 2**31, size=10)
self.arrival = np.random.default_rng(seeds[0])
self.service = np.random.default_rng(seeds[1])
self.failure = np.random.default_rng(seeds[2])
def run_simulation(master_seed, num_servers, run_time):
env = simpy.Environment()
rng = SimulationRNG(master_seed)
server = simpy.Resource(env, capacity=num_servers)
stats = []
def customer(name):
arrival = env.now
with server.request() as req:
yield req
wait = env.now - arrival
yield env.timeout(rng.service.exponential(5))
stats.append(wait)
def arrivals():
i = 0
while True:
yield env.timeout(rng.arrival.exponential(6))
env.process(customer(f"C{i}"))
i += 1
env.process(arrivals())
env.run(until=run_time)
return np.mean(stats) if stats else 0
# Run with reproducible seeds
for seed in [42, 43, 44]:
result = run_simulation(seed, num_servers=2, run_time=1000)
print(f"Seed {seed}: Mean wait = {result:.2f}")
Summary
Randomness in simulation:
- Always seed for reproducibility
- Use separate streams for isolation
- NumPy's default_rng is the modern approach
- Common random numbers for fair comparisons
- Log your seeds!
Random doesn't mean uncontrolled.
Next Steps
Discover the Power of Simulation
Want to become a go-to expert in simulation with Python? The Complete Simulation Bootcamp will show you how simulation can transform your career and your projects.
Explore the Bootcamp