Confidence Intervals in SimPy: Quantifying Uncertainty
<p>A point estimate without a confidence interval is a guess with false precision. Here's how to do it properly.</p> <h2 id="whats-a-confidence-interval">What's a Confidence Interval?</h2> <p>A 95% confidence interval means: if you ran this analysis 100 times with different random seeds, about 95 of those intervals would contain the true mean.</p> <p>It's not "95% probability the true value is in this range." That's a common misconception.</p> <h2 id="basic-calculation">Basic Calculation</h2> <div class="code-block"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">numpy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">np</span> <span class="kn">from</span><span class="w"> </span><span class="nn">scipy</span><span class="w"> </span><span class="kn">import</span> <span class="n">stats</span> <span class="k">def</span><span class="w"> </span><span class="nf">confidence_interval</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Calculate confidence interval for the mean."""</span> <span class="n">n</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="n">mean</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="n">se</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">sem</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="c1"># Standard error</span> <span class="n">t_value</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="n">margin</span> <span class="o">=</span> <span class="n">t_value</span> <span class="o">*</span> <span class="n">se</span> <span class="k">return</span> <span class="p">{</span> <span class="s1">'mean'</span><span class="p">:</span> <span class="n">mean</span><span class="p">,</span> <span class="s1">'lower'</span><span class="p">:</span> <span class="n">mean</span> <span class="o">-</span> <span class="n">margin</span><span class="p">,</span> <span class="s1">'upper'</span><span class="p">:</span> <span class="n">mean</span> <span class="o">+</span> <span class="n">margin</span><span class="p">,</span> <span class="s1">'margin'</span><span class="p">:</span> <span class="n">margin</span><span class="p">,</span> <span class="s1">'n'</span><span class="p">:</span> <span class="n">n</span> <span class="p">}</span> <span class="c1"># Usage</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="mf">4.2</span><span class="p">,</span> <span class="mf">5.1</span><span class="p">,</span> <span class="mf">3.8</span><span class="p">,</span> <span class="mf">4.5</span><span class="p">,</span> <span class="mf">5.2</span><span class="p">,</span> <span class="mf">4.1</span><span class="p">,</span> <span class="mf">4.8</span><span class="p">,</span> <span class="mf">3.9</span><span class="p">,</span> <span class="mf">4.6</span><span class="p">,</span> <span class="mf">5.0</span><span class="p">]</span> <span class="n">ci</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">results</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Mean: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"95% CI: [</span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'lower'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'upper'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">]"</span><span class="p">)</span> </code></pre></div> <h2 id="from-multiple-replications">From Multiple Replications</h2> <p>The standard workflow:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">run_replication</span><span class="p">(</span><span class="n">seed</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Run one simulation replication, return mean wait time."""</span> <span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> <span class="n">env</span> <span class="o">=</span> <span class="n">simpy</span><span class="o">.</span><span class="n">Environment</span><span class="p">()</span> <span class="c1"># ... run simulation</span> <span class="k">return</span> <span class="n">mean_wait_time</span> <span class="c1"># Run multiple replications</span> <span class="n">n_reps</span> <span class="o">=</span> <span class="mi">30</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_reps</span><span class="p">)]</span> <span class="c1"># Calculate CI</span> <span class="n">ci</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">results</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Mean wait time: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2"> ± </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'margin'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> </code></pre></div> <h2 id="choosing-sample-size">Choosing Sample Size</h2> <p>How many replications for a target half-width?</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">required_sample_size</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">target_half_width</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Estimate required sample size for target precision."""</span> <span class="n">n_pilot</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">)</span> <span class="n">s</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">std</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">ddof</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="c1"># Initial estimate using normal approximation</span> <span class="n">z</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="n">n_estimate</span> <span class="o">=</span> <span class="p">(</span><span class="n">z</span> <span class="o">*</span> <span class="n">s</span> <span class="o">/</span> <span class="n">target_half_width</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="c1"># Refine using t-distribution</span> <span class="n">t</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">t</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">,</span> <span class="n">df</span><span class="o">=</span><span class="nb">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">int</span><span class="p">(</span><span class="n">n_estimate</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">))</span> <span class="n">n_required</span> <span class="o">=</span> <span class="p">(</span><span class="n">t</span> <span class="o">*</span> <span class="n">s</span> <span class="o">/</span> <span class="n">target_half_width</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">ceil</span><span class="p">(</span><span class="n">n_required</span><span class="p">))</span> <span class="c1"># Pilot study with 10 runs</span> <span class="n">pilot</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">)]</span> <span class="n">needed</span> <span class="o">=</span> <span class="n">required_sample_size</span><span class="p">(</span><span class="n">pilot</span><span class="p">,</span> <span class="n">target_half_width</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Need </span><span class="si">{</span><span class="n">needed</span><span class="si">}</span><span class="s2"> replications for half-width of 0.5"</span><span class="p">)</span> </code></pre></div> <h2 id="relative-precision">Relative Precision</h2> <p>Sometimes you want a percentage precision:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">required_for_relative_precision</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">target_relative_error</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Sample size for target relative half-width (e.g., 5% of mean)."""</span> <span class="n">cv</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">std</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">,</span> <span class="n">ddof</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">pilot_data</span><span class="p">)</span> <span class="c1"># Coefficient of variation</span> <span class="n">z</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">+</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span> <span class="n">n_required</span> <span class="o">=</span> <span class="p">(</span><span class="n">z</span> <span class="o">*</span> <span class="n">cv</span> <span class="o">/</span> <span class="n">target_relative_error</span><span class="p">)</span> <span class="o">**</span> <span class="mi">2</span> <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">ceil</span><span class="p">(</span><span class="n">n_required</span><span class="p">))</span> <span class="c1"># For 5% relative precision</span> <span class="n">needed</span> <span class="o">=</span> <span class="n">required_for_relative_precision</span><span class="p">(</span><span class="n">pilot</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">)</span> </code></pre></div> <h2 id="comparing-two-scenarios">Comparing Two Scenarios</h2> <p>Paired comparison with confidence interval:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">compare_scenarios</span><span class="p">(</span><span class="n">results_a</span><span class="p">,</span> <span class="n">results_b</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""</span> <span class="sd"> Paired comparison of two scenarios.</span> <span class="sd"> Returns CI for the difference (A - B).</span> <span class="sd"> """</span> <span class="n">differences</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">results_a</span><span class="p">)</span> <span class="o">-</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">results_b</span><span class="p">)</span> <span class="n">ci</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">differences</span><span class="p">,</span> <span class="n">confidence</span><span class="p">)</span> <span class="k">return</span> <span class="p">{</span> <span class="s1">'mean_diff'</span><span class="p">:</span> <span class="n">ci</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">],</span> <span class="s1">'ci_lower'</span><span class="p">:</span> <span class="n">ci</span><span class="p">[</span><span class="s1">'lower'</span><span class="p">],</span> <span class="s1">'ci_upper'</span><span class="p">:</span> <span class="n">ci</span><span class="p">[</span><span class="s1">'upper'</span><span class="p">],</span> <span class="s1">'significant'</span><span class="p">:</span> <span class="p">(</span><span class="n">ci</span><span class="p">[</span><span class="s1">'lower'</span><span class="p">]</span> <span class="o">></span> <span class="mi">0</span><span class="p">)</span> <span class="ow">or</span> <span class="p">(</span><span class="n">ci</span><span class="p">[</span><span class="s1">'upper'</span><span class="p">]</span> <span class="o"><</span> <span class="mi">0</span><span class="p">)</span> <span class="p">}</span> <span class="c1"># Run both scenarios with same seeds (paired)</span> <span class="n">results_2_servers</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">,</span> <span class="n">servers</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">)]</span> <span class="n">results_3_servers</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">,</span> <span class="n">servers</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">)]</span> <span class="n">comparison</span> <span class="o">=</span> <span class="n">compare_scenarios</span><span class="p">(</span><span class="n">results_2_servers</span><span class="p">,</span> <span class="n">results_3_servers</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Difference: </span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">'mean_diff'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"95% CI: [</span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">'ci_lower'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">'ci_upper'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">]"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Significant: </span><span class="si">{</span><span class="n">comparison</span><span class="p">[</span><span class="s1">'significant'</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> </code></pre></div> <h2 id="multiple-metrics">Multiple Metrics</h2> <div class="code-block"><pre><span></span><code><span class="k">class</span><span class="w"> </span><span class="nc">SimulationResults</span><span class="p">:</span> <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">def</span><span class="w"> </span><span class="nf">add_replication</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">metrics</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""metrics = {'wait': 4.2, 'util': 0.85, 'throughput': 12.5}"""</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">metrics</span><span class="p">)</span> <span class="k">def</span><span class="w"> </span><span class="nf">confidence_intervals</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Calculate CIs for all metrics."""</span> <span class="n">results</span> <span class="o">=</span> <span class="p">{}</span> <span class="n">metrics</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span> <span class="k">for</span> <span class="n">metric</span> <span class="ow">in</span> <span class="n">metrics</span><span class="p">:</span> <span class="n">values</span> <span class="o">=</span> <span class="p">[</span><span class="n">r</span><span class="p">[</span><span class="n">metric</span><span class="p">]</span> <span class="k">for</span> <span class="n">r</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">replications</span><span class="p">]</span> <span class="n">results</span><span class="p">[</span><span class="n">metric</span><span class="p">]</span> <span class="o">=</span> <span class="n">confidence_interval</span><span class="p">(</span><span class="n">values</span><span class="p">,</span> <span class="n">confidence</span><span class="p">)</span> <span class="k">return</span> <span class="n">results</span> <span class="c1"># Usage</span> <span class="n">results</span> <span class="o">=</span> <span class="n">SimulationResults</span><span class="p">()</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">):</span> <span class="n">metrics</span> <span class="o">=</span> <span class="n">run_replication_full</span><span class="p">(</span><span class="mi">42</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="n">results</span><span class="o">.</span><span class="n">add_replication</span><span class="p">(</span><span class="n">metrics</span><span class="p">)</span> <span class="n">cis</span> <span class="o">=</span> <span class="n">results</span><span class="o">.</span><span class="n">confidence_intervals</span><span class="p">()</span> <span class="k">for</span> <span class="n">metric</span><span class="p">,</span> <span class="n">ci</span> <span class="ow">in</span> <span class="n">cis</span><span class="o">.</span><span class="n">items</span><span class="p">():</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="n">metric</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2"> [</span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'lower'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">, </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'upper'</span><span class="p">]</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">]"</span><span class="p">)</span> </code></pre></div> <h2 id="visualising-confidence-intervals">Visualising Confidence Intervals</h2> <div class="code-block"><pre><span></span><code><span class="kn">import</span><span class="w"> </span><span class="nn">matplotlib.pyplot</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">plt</span> <span class="k">def</span><span class="w"> </span><span class="nf">plot_ci_comparison</span><span class="p">(</span><span class="n">scenarios</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""</span> <span class="sd"> scenarios = {</span> <span class="sd"> 'Baseline': {'mean': 5.2, 'lower': 4.8, 'upper': 5.6},</span> <span class="sd"> 'Improved': {'mean': 3.1, 'lower': 2.9, 'upper': 3.3}</span> <span class="sd"> }</span> <span class="sd"> """</span> <span class="n">names</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">scenarios</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span> <span class="n">means</span> <span class="o">=</span> <span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">scenarios</span><span class="o">.</span><span class="n">values</span><span class="p">()]</span> <span class="n">errors</span> <span class="o">=</span> <span class="p">[[</span><span class="n">s</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]</span> <span class="o">-</span> <span class="n">s</span><span class="p">[</span><span class="s1">'lower'</span><span class="p">],</span> <span class="n">s</span><span class="p">[</span><span class="s1">'upper'</span><span class="p">]</span> <span class="o">-</span> <span class="n">s</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">scenarios</span><span class="o">.</span><span class="n">values</span><span class="p">()]</span> <span class="n">errors</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">errors</span><span class="p">)</span><span class="o">.</span><span class="n">T</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span><span class="n">names</span><span class="p">,</span> <span class="n">means</span><span class="p">,</span> <span class="n">yerr</span><span class="o">=</span><span class="n">errors</span><span class="p">,</span> <span class="n">capsize</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s1">'steelblue'</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.7</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s1">'Mean Wait Time'</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">title</span><span class="p">(</span><span class="s1">'Scenario Comparison with 95% Confidence Intervals'</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <h2 id="bootstrap-confidence-intervals">Bootstrap Confidence Intervals</h2> <p>When normality assumptions don't hold:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">bootstrap_ci</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">n_bootstrap</span><span class="o">=</span><span class="mi">10000</span><span class="p">,</span> <span class="n">confidence</span><span class="o">=</span><span class="mf">0.95</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Calculate bootstrap confidence interval."""</span> <span class="n">bootstrapped_means</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_bootstrap</span><span class="p">):</span> <span class="n">sample</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">choice</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">),</span> <span class="n">replace</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span> <span class="n">bootstrapped_means</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">sample</span><span class="p">))</span> <span class="n">alpha</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">confidence</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span> <span class="n">lower</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">percentile</span><span class="p">(</span><span class="n">bootstrapped_means</span><span class="p">,</span> <span class="n">alpha</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="n">upper</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">percentile</span><span class="p">(</span><span class="n">bootstrapped_means</span><span class="p">,</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">alpha</span><span class="p">)</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="k">return</span> <span class="p">{</span> <span class="s1">'mean'</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">mean</span><span class="p">(</span><span class="n">data</span><span class="p">),</span> <span class="s1">'lower'</span><span class="p">:</span> <span class="n">lower</span><span class="p">,</span> <span class="s1">'upper'</span><span class="p">:</span> <span class="n">upper</span> <span class="p">}</span> </code></pre></div> <h2 id="reporting-results">Reporting Results</h2> <p>Good reporting includes:</p> <div class="code-block"><pre><span></span><code><span class="k">def</span><span class="w"> </span><span class="nf">format_result</span><span class="p">(</span><span class="n">ci</span><span class="p">,</span> <span class="n">metric_name</span><span class="p">,</span> <span class="n">decimals</span><span class="o">=</span><span class="mi">2</span><span class="p">):</span> <span class="w"> </span><span class="sd">"""Format a confidence interval for reporting."""</span> <span class="k">return</span> <span class="p">(</span> <span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="n">metric_name</span><span class="si">}</span><span class="s2">: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'mean'</span><span class="p">]</span><span class="si">:</span><span class="s2">.</span><span class="si">{</span><span class="n">decimals</span><span class="si">}</span><span class="s2">f</span><span class="si">}</span><span class="s2"> "</span> <span class="sa">f</span><span class="s2">"(95% CI: </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'lower'</span><span class="p">]</span><span class="si">:</span><span class="s2">.</span><span class="si">{</span><span class="n">decimals</span><span class="si">}</span><span class="s2">f</span><span class="si">}</span><span class="s2"> - </span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'upper'</span><span class="p">]</span><span class="si">:</span><span class="s2">.</span><span class="si">{</span><span class="n">decimals</span><span class="si">}</span><span class="s2">f</span><span class="si">}</span><span class="s2">, "</span> <span class="sa">f</span><span class="s2">"n=</span><span class="si">{</span><span class="n">ci</span><span class="p">[</span><span class="s1">'n'</span><span class="p">]</span><span class="si">}</span><span class="s2">)"</span> <span class="p">)</span> <span class="c1"># Example output:</span> <span class="c1"># Mean wait time: 4.52 (95% CI: 4.21 - 4.83, n=30)</span> </code></pre></div> <h2 id="common-mistakes">Common Mistakes</h2> <p><strong>Too few replications:</strong></p> <div class="code-block"><pre><span></span><code><span class="c1"># Bad: CI is meaninglessly wide</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">3</span><span class="p">)]</span> <span class="c1"># Good: CI has reasonable precision</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="n">run_replication</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">30</span><span class="p">)]</span> </code></pre></div> <p><strong>Ignoring warm-up:</strong></p> <div class="code-block"><pre><span></span><code><span class="c1"># Bad: includes initialisation bias</span> <span class="n">mean_wait</span> <span class="o">=</span> <span class="n">simulation_run</span><span class="p">()</span> <span class="c1"># Good: excludes warm-up period</span> <span class="n">mean_wait</span> <span class="o">=</span> <span class="n">simulation_run_with_warmup</span><span class="p">(</span><span class="n">warmup</span><span class="o">=</span><span class="mi">100</span><span class="p">)</span> </code></pre></div> <p><strong>Reporting without CI:</strong></p> <div class="code-block"><pre><span></span><code><span class="c1"># Bad: false precision</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Mean wait: 4.523"</span><span class="p">)</span> <span class="c1"># Good: honest uncertainty</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Mean wait: 4.5 (95% CI: 4.2-4.8)"</span><span class="p">)</span> </code></pre></div> <h2 id="summary">Summary</h2> <p>Confidence intervals: - Quantify uncertainty in your estimates - Require multiple replications - Enable meaningful comparisons - Should always accompany point estimates</p> <p>A simulation result without a confidence interval is not entirely trustworthy. Always report your uncertainty.</p> <h2 id="next-steps">Next Steps</h2> <ul> <li><a href="/blog_posts/simpy-run-multiple-replications.html">Running Multiple Replications</a></li> <li><a href="/blog_posts/simpy-warm-up-period.html">Warm-Up Period</a></li> <li><a href="/blog_posts/simpy-visualize-results.html">Visualising Results</a></li> </ul>Discover the Power of Simulation
Want to become a go-to expert in simulation with Python? The Complete Simulation Bootcamp will show you how simulation can transform your career and your projects.
Explore the Bootcamp