Confidence Intervals in SimPy: Quantifying Uncertainty

<p>A point estimate without a confidence interval is a guess with false precision. Here's how to do it properly.</p> <h2 id="whats-a-confidence-interval">What's a Confidence Interval?</h2> <p>A 95% confidence interval means: if you ran this analysis 100 times with different random seeds, about 95 of those intervals would contain the true mean.</p> <p>It's not "95% probability the true value is in this range." That's a common misconception.</p> <h2 id="basic-calculation">Basic Calculation</h2> <div class="code-block"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">numpy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">np</span> <span class="kn">from</span><span class="w"> </span><span class="nn">scipy</span><span class="w"> </span><span class="kn">import</span> <span class="n">stats</span> <span class="k">def</span><span class="w"> </span><span class="nf">confidence_interval</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Calculate confidence interval for the mean.&quot;&quot;&quot;</span> <span class="n">n</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="n">mean</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="n">se</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">sem</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="c1"># Standard error</span> <span class="n">t_value</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="n">margin</span> <span class="o">=</span> <span class="n">t_value</span> <span class="o">*</span> <span class="n">se</span> <span class="k">return</span> <span class="p">{</span> <span class="s1">&#39;mean&#39;</span><span class="p">:</span> <span class="n">mean</span><span class="p">,</span> <span class="s1">&#39;lower&#39;</span><span class="p">:</span> <span class="n">mean</span> <span class="o">-</span> <span class="n">margin</span><span class="p">,</span> <span class="s1">&#39;upper&#39;</span><span class="p">:</span> <span class="n">mean</span> <span class="o">+</span> <span class="n">margin</span><span class="p">,</span> <span class="s1">&#39;margin&#39;</span><span class="p">:</span> <span class="n">margin</span><span class="p">,</span> <span class="s1">&#39;n&#39;</span><span class="p">:</span> <span class="n">n</span> <span class="p">}</span> <span class="c1"># Usage</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="mf">4.2</span><span class="p">,</span> <span class="mf">5.1</span><span class="p">,</span> <span class="mf">3.8</span><span class="p">,</span> <span class="mf">4.5</span><span class="p">,</span> <span class="mf">5.2</span><span class="p">,</span> <span class="mf">4.1</span><span class="p">,</span> <span class="mf">4.8</span><span class="p">,</span> <span class="mf">3.9</span><span class="p">,</span> <span class="mf">4.6</span><span class="p">,</span> <span class="mf">5.0</span><span class="p">]</span> <span class="n">ci</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">results</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Mean: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;95% CI: [</span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;lower&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;upper&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">]&quot;</span><span class="p">)</span> </code></pre></div> <h2 id="from-multiple-replications">From Multiple Replications</h2> <p>The standard workflow:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">run_replication</span><span class="p">(</span><span class="n">seed</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Run one simulation replication, return mean wait time.&quot;&quot;&quot;</span> <span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> <span class="n">env</span> <span class="o">=</span> <span class="n">simpy</span><span class="o">.</span><span class="n">Environment</span><span class="p">()</span> <span class="c1"># ... run simulation</span> <span class="k">return</span> <span class="n">mean_wait_time</span> <span class="c1"># Run multiple replications</span> <span class="n">n_reps</span> <span class="o">=</span> <span class="mi">30</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_reps</span><span class="p">)]</span> <span class="c1"># Calculate CI</span> <span class="n">ci</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">results</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Mean wait time: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2"> ± </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;margin&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> </code></pre></div> <h2 id="choosing-sample-size">Choosing Sample Size</h2> <p>How many replications for a target half-width?</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">required_sample_size</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">target_half_width</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Estimate required sample size for target precision.&quot;&quot;&quot;</span> <span class="n">n_pilot</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">)</span> <span class="n">s</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">std</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">ddof</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="c1"># Initial estimate using normal approximation</span> <span class="n">z</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="n">n_estimate</span> <span class="o">=</span> <span class="p">(</span><span class="n">z</span> <span class="o">*</span> <span class="n">s</span> <span class="o">/</span> <span class="n">target_half_width</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="c1"># Refine using t-distribution</span> <span class="n">t</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">n_estimate</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">))</span> <span class="n">n_required</span> <span class="o">=</span> <span class="p">(</span><span class="n">t</span> <span class="o">*</span> <span class="n">s</span> <span class="o">/</span> <span class="n">target_half_width</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">ceil</span><span class="p">(</span><span class="n">n_required</span><span class="p">))</span> <span class="c1"># Pilot study with 10 runs</span> <span class="n">pilot</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">)]</span> <span class="n">needed</span> <span class="o">=</span> <span class="n">required_sample_size</span><span class="p">(</span><span class="n">pilot</span><span class="p">,</span> <span class="n">target_half_width</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Need </span><span class="si">{</span><span class="n">needed</span><span class="si">}</span><span class="s2"> replications for half-width of 0.5&quot;</span><span class="p">)</span> </code></pre></div> <h2 id="relative-precision">Relative Precision</h2> <p>Sometimes you want a percentage precision:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">required_for_relative_precision</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">target_relative_error</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Sample size for target relative half-width (e.g., 5% of mean).&quot;&quot;&quot;</span> <span class="n">cv</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">std</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">ddof</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">)</span> <span class="c1"># Coefficient of variation</span> <span class="n">z</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="n">n_required</span> <span class="o">=</span> <span class="p">(</span><span class="n">z</span> <span class="o">*</span> <span class="n">cv</span> <span class="o">/</span> <span class="n">target_relative_error</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">ceil</span><span class="p">(</span><span class="n">n_required</span><span class="p">))</span> <span class="c1"># For 5% relative precision</span> <span class="n">needed</span> <span class="o">=</span> <span class="n">required_for_relative_precision</span><span class="p">(</span><span class="n">pilot</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">)</span> </code></pre></div> <h2 id="comparing-two-scenarios">Comparing Two Scenarios</h2> <p>Paired comparison with confidence interval:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">compare_scenarios</span><span class="p">(</span><span class="n">results_a</span><span class="p">,</span> <span class="n">results_b</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;</span> <span class="sd"> Paired comparison of two scenarios.</span> <span class="sd"> Returns CI for the difference (A - B).</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="n">differences</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">results_a</span><span class="p">)</span> <span class="o">-</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">results_b</span><span class="p">)</span> <span class="n">ci</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">differences</span><span class="p">,</span> <span class="n">confidence</span><span class="p">)</span> <span class="k">return</span> <span class="p">{</span> <span class="s1">&#39;mean_diff&#39;</span><span class="p">:</span> <span class="n">ci</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">],</span> <span class="s1">&#39;ci_lower&#39;</span><span class="p">:</span> <span class="n">ci</span><span class="p">[</span><span class="s1">&#39;lower&#39;</span><span class="p">],</span> <span class="s1">&#39;ci_upper&#39;</span><span class="p">:</span> <span class="n">ci</span><span class="p">[</span><span class="s1">&#39;upper&#39;</span><span class="p">],</span> <span class="s1">&#39;significant&#39;</span><span class="p">:</span> <span class="p">(</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;lower&#39;</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span> <span class="ow">or</span> <span class="p">(</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;upper&#39;</span><span class="p">]</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">}</span> <span class="c1"># Run both scenarios with same seeds (paired)</span> <span class="n">results_2_servers</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">,</span> <span class="n">servers</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">)]</span> <span class="n">results_3_servers</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">,</span> <span class="n">servers</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">)]</span> <span class="n">comparison</span> <span class="o">=</span> <span class="n">compare_scenarios</span><span class="p">(</span><span class="n">results_2_servers</span><span class="p">,</span> <span class="n">results_3_servers</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Difference: </span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">&#39;mean_diff&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;95% CI: [</span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">&#39;ci_lower&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">&#39;ci_upper&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">]&quot;</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Significant: </span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">&#39;significant&#39;</span><span class="p">]</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span> </code></pre></div> <h2 id="multiple-metrics">Multiple Metrics</h2> <div class="code-block"><pre><span></span><code><span class="k">class</span><span class="w"> </span><span class="nc">SimulationResults</span><span class="p">:</span> <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">def</span><span class="w"> </span><span class="nf">add_replication</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">metrics</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;metrics = {&#39;wait&#39;: 4.2, &#39;util&#39;: 0.85, &#39;throughput&#39;: 12.5}&quot;&quot;&quot;</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">metrics</span><span class="p">)</span> <span class="k">def</span><span class="w"> </span><span class="nf">confidence_intervals</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Calculate CIs for all metrics.&quot;&quot;&quot;</span> <span class="n">results</span> <span class="o">=</span> <span class="p">{}</span> <span class="n">metrics</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span> <span class="k">for</span> <span class="n">metric</span> <span class="ow">in</span> <span class="n">metrics</span><span class="p">:</span> <span class="n">values</span> <span class="o">=</span> <span class="p">[</span><span class="n">r</span><span class="p">[</span><span class="n">metric</span><span class="p">]</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span><span class="p">]</span> <span class="n">results</span><span class="p">[</span><span class="n">metric</span><span class="p">]</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">confidence</span><span class="p">)</span> <span class="k">return</span> <span class="n">results</span> <span class="c1"># Usage</span> <span class="n">results</span> <span class="o">=</span> <span class="n">SimulationResults</span><span class="p">()</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">):</span> <span class="n">metrics</span> <span class="o">=</span> <span class="n">run_replication_full</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="n">results</span><span class="o">.</span><span class="n">add_replication</span><span class="p">(</span><span class="n">metrics</span><span class="p">)</span> <span class="n">cis</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">confidence_intervals</span><span class="p">()</span> <span class="k">for</span> <span class="n">metric</span><span class="p">,</span> <span class="n">ci</span> <span class="ow">in</span> <span class="n">cis</span><span class="o">.</span><span class="n">items</span><span class="p">():</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">metric</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2"> [</span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;lower&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;upper&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">]&quot;</span><span class="p">)</span> </code></pre></div> <h2 id="visualising-confidence-intervals">Visualising Confidence Intervals</h2> <div class="code-block"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">matplotlib.pyplot</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">plt</span> <span class="k">def</span><span class="w"> </span><span class="nf">plot_ci_comparison</span><span class="p">(</span><span class="n">scenarios</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;</span> <span class="sd"> scenarios = {</span> <span class="sd"> &#39;Baseline&#39;: {&#39;mean&#39;: 5.2, &#39;lower&#39;: 4.8, &#39;upper&#39;: 5.6},</span> <span class="sd"> &#39;Improved&#39;: {&#39;mean&#39;: 3.1, &#39;lower&#39;: 2.9, &#39;upper&#39;: 3.3}</span> <span class="sd"> }</span> <span class="sd"> &quot;&quot;&quot;</span> <span class="n">names</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">scenarios</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span> <span class="n">means</span> <span class="o">=</span> <span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">scenarios</span><span class="o">.</span><span class="n">values</span><span class="p">()]</span> <span class="n">errors</span> <span class="o">=</span> <span class="p">[[</span><span class="n">s</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]</span> <span class="o">-</span> <span class="n">s</span><span class="p">[</span><span class="s1">&#39;lower&#39;</span><span class="p">],</span> <span class="n">s</span><span class="p">[</span><span class="s1">&#39;upper&#39;</span><span class="p">]</span> <span class="o">-</span> <span class="n">s</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">scenarios</span><span class="o">.</span><span class="n">values</span><span class="p">()]</span> <span class="n">errors</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">errors</span><span class="p">)</span><span class="o">.</span><span class="n">T</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span><span class="n">names</span><span class="p">,</span> <span class="n">means</span><span class="p">,</span> <span class="n">yerr</span><span class="o">=</span><span class="n">errors</span><span class="p">,</span> <span class="n">capsize</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s1">&#39;steelblue&#39;</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">&#39;Mean Wait Time&#39;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s1">&#39;Scenario Comparison with 95% Confidence Intervals&#39;</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <h2 id="bootstrap-confidence-intervals">Bootstrap Confidence Intervals</h2> <p>When normality assumptions don't hold:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">bootstrap_ci</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">n_bootstrap</span><span class="o">=</span><span class="mi">10000</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Calculate bootstrap confidence interval.&quot;&quot;&quot;</span> <span class="n">bootstrapped_means</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_bootstrap</span><span class="p">):</span> <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">choice</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">),</span> <span class="n">replace</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span> <span class="n">bootstrapped_means</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span> <span class="n">alpha</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span> <span class="n">lower</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">percentile</span><span class="p">(</span><span class="n">bootstrapped_means</span><span class="p">,</span> <span class="n">alpha</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="n">upper</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">percentile</span><span class="p">(</span><span class="n">bootstrapped_means</span><span class="p">,</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">alpha</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="k">return</span> <span class="p">{</span> <span class="s1">&#39;mean&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">data</span><span class="p">),</span> <span class="s1">&#39;lower&#39;</span><span class="p">:</span> <span class="n">lower</span><span class="p">,</span> <span class="s1">&#39;upper&#39;</span><span class="p">:</span> <span class="n">upper</span> <span class="p">}</span> </code></pre></div> <h2 id="reporting-results">Reporting Results</h2> <p>Good reporting includes:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">format_result</span><span class="p">(</span><span class="n">ci</span><span class="p">,</span> <span class="n">metric_name</span><span class="p">,</span> <span class="n">decimals</span><span class="o">=</span><span class="mi">2</span><span class="p">):</span> <span class="w"> </span><span class="sd">&quot;&quot;&quot;Format a confidence interval for reporting.&quot;&quot;&quot;</span> <span class="k">return</span> <span class="p">(</span> <span class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span class="n">metric_name</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;mean&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.</span><span class="si">{</span><span class="n">decimals</span><span class="si">}</span><span class="s2">f</span><span class="si">}</span><span class="s2"> &quot;</span> <span class="sa">f</span><span class="s2">&quot;(95% CI: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;lower&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.</span><span class="si">{</span><span class="n">decimals</span><span class="si">}</span><span class="s2">f</span><span class="si">}</span><span class="s2"> - </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;upper&#39;</span><span class="p">]</span><span class="si">:</span><span class="s2">.</span><span class="si">{</span><span class="n">decimals</span><span class="si">}</span><span class="s2">f</span><span class="si">}</span><span class="s2">, &quot;</span> <span class="sa">f</span><span class="s2">&quot;n=</span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">&#39;n&#39;</span><span class="p">]</span><span class="si">}</span><span class="s2">)&quot;</span> <span class="p">)</span> <span class="c1"># Example output:</span> <span class="c1"># Mean wait time: 4.52 (95% CI: 4.21 - 4.83, n=30)</span> </code></pre></div> <h2 id="common-mistakes">Common Mistakes</h2> <p><strong>Too few replications:</strong></p> <div class="code-block"><pre><span></span><code><span class="c1"># Bad: CI is meaninglessly wide</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">3</span><span class="p">)]</span> <span class="c1"># Good: CI has reasonable precision</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">)]</span> </code></pre></div> <p><strong>Ignoring warm-up:</strong></p> <div class="code-block"><pre><span></span><code><span class="c1"># Bad: includes initialisation bias</span> <span class="n">mean_wait</span> <span class="o">=</span> <span class="n">simulation_run</span><span class="p">()</span> <span class="c1"># Good: excludes warm-up period</span> <span class="n">mean_wait</span> <span class="o">=</span> <span class="n">simulation_run_with_warmup</span><span class="p">(</span><span class="n">warmup</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span> </code></pre></div> <p><strong>Reporting without CI:</strong></p> <div class="code-block"><pre><span></span><code><span class="c1"># Bad: false precision</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Mean wait: 4.523&quot;</span><span class="p">)</span> <span class="c1"># Good: honest uncertainty</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Mean wait: 4.5 (95% CI: 4.2-4.8)&quot;</span><span class="p">)</span> </code></pre></div> <h2 id="summary">Summary</h2> <p>Confidence intervals: - Quantify uncertainty in your estimates - Require multiple replications - Enable meaningful comparisons - Should always accompany point estimates</p> <p>A simulation result without a confidence interval is not entirely trustworthy. Always report your uncertainty.</p> <h2 id="next-steps">Next Steps</h2> <ul> <li><a href="/blog_posts/simpy-run-multiple-replications.html">Running Multiple Replications</a></li> <li><a href="/blog_posts/simpy-warm-up-period.html">Warm-Up Period</a></li> <li><a href="/blog_posts/simpy-visualize-results.html">Visualising Results</a></li> </ul>

Discover the Power of Simulation

Want to become a go-to expert in simulation with Python? The Complete Simulation Bootcamp will show you how simulation can transform your career and your projects.

Explore the Bootcamp