Monday, 18 May 2026

SIMULATION EXPERIMENT LAB

SIMULATION TECHNIQUES: COMPLETE GUIDE

(Quantitative Techniques in Management)

TABLE OF CONTENTS

  1. Introduction to Simulation
  2. Fundamentals of Simulation
  3. Types of Simulation
  4. Monte Carlo Simulation (Detailed)
  5. Monte Carlo Example: Cake Demand Problem
  6. Inventory System Simulation (Detailed Example)
  7. Queuing System Simulation (Detailed Example)
  8. Important Formulae and Mathematical Concepts
  9. Advantages of Simulation
  10. Limitations of Simulation
  11. Applications of Simulation
  12. Step-by-Step Procedure for Solving Any Simulation Problem
  13. Quick Revision Checklist
  14. Final Conclusion

1. INTRODUCTION TO SIMULATION

Simulation is one of the most important quantitative decision-making techniques used in management, engineering, economics, operations research, healthcare, and business systems.

In real life, many systems contain uncertainty, risk, and randomness. Examples:

  • daily product demand changes,
  • machine breakdowns occur unexpectedly,
  • customers arrive randomly,
  • delivery time varies,
  • stock market prices fluctuate.

Because of this uncertainty, exact mathematical solutions often become difficult or impossible.

Simulation solves this problem by creating a virtual model of reality.


Definition of Simulation

Simulation is:

“A numerical and experimental method used to imitate the behavior of a real-world system over time.”

It allows us to test decisions before applying them in reality.

Simple Meaning

Instead of experimenting on the real system, we experiment on a model of it.

Example: A company wants to test:

  • What happens if inventory is reduced?
  • What happens if two attendants are hired?
  • What happens if reorder quantity changes?

Rather than risking money in real life, we simulate the system on paper or computer.

2. FUNDAMENTALS OF SIMULATION

Simulation is based on four major principles.

Principle 1: Model Building

A model is a simplified version of reality.

Example:

Real system: Factory inventory

Model: Closing stock = Opening stock − Demand + New supply

This becomes the simulation model.

Principle 2: Randomness

Many business variables are random:

  • customer arrivals,
  • machine failures,
  • market demand,
  • delivery lead time.

These random variables are represented using probability distributions.

Principle 3: Repeated Trials

One experiment is not enough.

Simulation repeats many trials:

10 times
100 times
1000 times

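More trials = better accuracy: the sampling error of a simulated average shrinks roughly in proportion to 1/√n, where n is the number of independent trials.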
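A minimal Python sketch of this principle, using a fair six-sided die as a stand-in random variable. The die, the function name estimate_mean_roll, and the fixed seed are illustrative assumptions, not part of the original notes. It shows the simulated average moving toward the exact value 3.5 as the number of trials grows.

```python
import random

TRUE_MEAN = 3.5  # exact expected value of a fair six-sided die


def estimate_mean_roll(trials, seed=42):
    """Average of `trials` simulated die rolls (seed fixed so runs are repeatable)."""
    rng = random.Random(seed)
    total = sum(rng.randint(1, 6) for _ in range(trials))
    return total / trials


for n in (10, 100, 1000, 100_000):
    estimate = estimate_mean_roll(n)
    error = abs(estimate - TRUE_MEAN)
    print(f"{n:>7} trials: estimate = {estimate:.3f}, error = {error:.3f}")
```

The error for any single run is itself random, so a larger run is not guaranteed to beat a smaller one every time; the improvement holds on average.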

Principle 4: Evaluation

After simulation:

  • compare policies,
  • calculate cost,
  • choose best alternative.

Four Phases of Simulation

Problem Definition
       ↓
Model Formulation
       ↓
Experimentation
       ↓
Evaluation

Phase 1: Problem Definition

Clearly define:

  • What is the problem?
  • What decision is needed?
  • What is the objective?

Example: Minimize inventory cost.

Phase 2: Model Formulation

Build equations and logic.

Example: Closing stock = Opening stock − Demand + New supply

Phase 3: Experimentation

Run the model using random numbers.

Phase 4: Evaluation

Interpret results.

Example: Policy A cost = ₹2356
Policy B cost = ₹2730

Choose Policy A.

3. TYPES OF SIMULATION

1. Monte Carlo Simulation

Uses random numbers.

Used for:

  • inventory,
  • finance,
  • risk analysis.

2. System Simulation

Studies complete systems over time.

Example: Airport operations.

3. Discrete Event Simulation

Studies events occurring at separate times.

Example: Customer arrival at bank.

4. Continuous Simulation

Studies continuously changing systems.

Example: Temperature control.

4. MONTE CARLO SIMULATION (DETAILED)

Monte Carlo is one of the most widely used simulation techniques.

Meaning

It uses:

  1. Probability distribution
  2. Random numbers
  3. Repeated experiments

to estimate future outcomes.

Why called Monte Carlo?

It is named after the Monte Carlo Casino in Monaco because it relies on chance and randomness, much like gambling.

Steps in Monte Carlo Simulation

Step 1: Identify variable

Example: Daily demand.

Step 2: Build probability distribution

Formula: Probability = Frequency of the value ÷ Total frequency

Step 3: Calculate cumulative probability

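Formula: Cumulative probability = Sum of the probabilities of all values up to and including the current value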

Step 4: Assign random number intervals

Example:

00–09
10–29
30–59
60–99

Step 5: Generate random numbers

Example: 61, 24, 03, 92

Step 6: Map random numbers

Example: 61 → demand = 9

Step 7: Analyze output

Average demand, cost, shortage etc.

5. MONTE CARLO EXAMPLE: CAKE DEMAND

A bakery records demand for 200 days.

Historical Data

Demand Days
5 4
6 10
7 16
8 50
9 62
10 38
11 12
12 8

Total = 200

Step 1: Probability

Demand Probability
5 0.02
6 0.05
7 0.08
8 0.25
9 0.31
10 0.19
11 0.06
12 0.04

Step 2: Cumulative Probability

Demand Cumulative
5 0.02
6 0.07
7 0.15
8 0.40
9 0.71
10 0.90
11 0.96
12 1.00

Step 3: Random Number Assignment

Demand Interval
5 00–01
6 02–06
7 07–14
8 15–39
9 40–70
10 71–89
11 90–95
12 96–99
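The table construction in Steps 1 to 3 can be sketched in Python as follows. The function name build_intervals is hypothetical, and the sketch assumes every probability is a whole multiple of 0.01, so that each interval boundary falls exactly on a two-digit random number.

```python
# Historical cake demand: demand level -> number of days observed (from the table above)
frequency = {5: 4, 6: 10, 7: 16, 8: 50, 9: 62, 10: 38, 11: 12, 12: 8}


def build_intervals(freq):
    """Return rows of (demand, probability, cumulative, rn_low, rn_high) for random numbers 00-99."""
    total = sum(freq.values())
    rows, running = [], 0
    for demand in sorted(freq):
        low = round(100 * running / total)        # first random number of the interval
        running += freq[demand]
        high = round(100 * running / total) - 1   # last random number of the interval
        rows.append((demand, freq[demand] / total, running / total, low, high))
    return rows


table = build_intervals(frequency)
for demand, prob, cum, low, high in table:
    print(f"{demand:>2}  {prob:.2f}  {cum:.2f}  {low:02d}-{high:02d}")
```

Each interval contains as many random numbers as the probability has hundredths, which is what makes the mapping reproduce the historical distribution.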

Step 4: Random Numbers

61, 74, 24, 03, 59, 16, 84, 92, 52, 07


Step 5: Simulated Demand

RN Demand
61 9
74 10
24 8
03 6
59 9
16 8
84 10
92 11
52 9
07 7

Average Demand

Average demand = (9 + 10 + 8 + 6 + 9 + 8 + 10 + 11 + 9 + 7) ÷ 10 = 87 ÷ 10 = 8.7 cakes per day

Interpretation: Average daily demand helps production planning.
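The lookup in Step 5 and the resulting average can be reproduced with a short Python sketch. The helper name demand_for is an assumption introduced here; the interval bounds and the ten random numbers are taken from the tables above.

```python
from bisect import bisect_left

demands = [5, 6, 7, 8, 9, 10, 11, 12]
upper_bounds = [1, 6, 14, 39, 70, 89, 95, 99]  # last random number of each interval


def demand_for(rn):
    """Map a two-digit random number (0-99) to the demand whose interval contains it."""
    return demands[bisect_left(upper_bounds, rn)]


random_numbers = [61, 74, 24, 3, 59, 16, 84, 92, 52, 7]
simulated = [demand_for(rn) for rn in random_numbers]
average = sum(simulated) / len(simulated)

print(simulated)
print(average)
```

With only ten trials the average is a rough estimate; the probability table itself implies a long-run mean of 8.79 cakes per day.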


6. INVENTORY SYSTEM SIMULATION


Objective

Find best:

  • reorder level
  • reorder quantity

to minimize total cost.

Given Data

Ordering cost = ₹80/order
Holding cost = ₹2/unit/day
Shortage cost = ₹20/unit/day

Beginning inventory = 30

Policy: ROL = 20
ROQ = 40

Simulation period = 40 days


Why Inventory Simulation?

Because two key inputs are uncertain:

  1. Demand changes daily
  2. Delivery lead time changes

Both affect inventory cost.


Demand Distribution

Demand Probability
3 0.02
4 0.08
5 0.11
6 0.16
7 0.19
8 0.13
9 0.10
10 0.08
11 0.07
12 0.06

Lead Time Distribution

Days Probability
2 0.20
3 0.30
4 0.35
5 0.15

Daily Simulation Logic


Day 1

Opening = 30

RN = 68
Demand = 8

Closing:

30 − 8 = 22

Holding cost: 22 × 2 = ₹44

No order.


Day 2

Opening = 22

RN = 13
Demand = 5

Closing:

22 − 5 = 17

Since 17 < 20

Place order = 40

RN lead = 47

Lead time = 3 days

Order arrives on Day 5.

Ordering cost = ₹80


Continue for all 40 days

Then total all costs.

Final Result

Ordering = ₹640
Holding = ₹1456
Shortage = ₹260

Total Cost

= ₹2356

Best policy among those compared: ROL = 20, ROQ = 40
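The daily logic can be sketched in Python as below. This is an illustration under stated assumptions, not the exact procedure behind the totals above: an order is assumed to arrive before demand on its arrival day, unmet demand is assumed lost rather than backordered, and at most one order is assumed outstanding at a time. The function names and the 00–99 interval coding (derived from the two probability tables) are introduced here for illustration. Only the random numbers for Days 1 and 2 appear in the text, so the run covers those two days.

```python
from bisect import bisect_left

# (value, last random number of its interval), derived from the probability tables
DEMAND = [(3, 1), (4, 9), (5, 20), (6, 36), (7, 55),
          (8, 68), (9, 78), (10, 86), (11, 93), (12, 99)]
LEAD_TIME = [(2, 19), (3, 49), (4, 84), (5, 99)]

ORDER_COST, HOLD_COST, SHORT_COST = 80, 2, 20  # rupees, from the given data


def lookup(table, rn):
    """Return the value whose random-number interval contains rn (0-99)."""
    bounds = [upper for _, upper in table]
    return table[bisect_left(bounds, rn)][0]


def simulate(demand_rns, lead_rns, opening=30, rol=20, roq=40):
    """Simulate one day per demand random number; return (daily rows, cost totals)."""
    lead_rns = iter(lead_rns)
    stock, arrival_day = opening, None  # arrival_day: day the outstanding order lands
    totals = {"ordering": 0, "holding": 0, "shortage": 0}
    rows = []
    for day, rn in enumerate(demand_rns, start=1):
        if arrival_day == day:           # assumption: delivery arrives before demand
            stock += roq
            arrival_day = None
        demand = lookup(DEMAND, rn)
        short = max(0, demand - stock)   # assumption: unmet demand is lost
        stock = max(0, stock - demand)
        totals["holding"] += stock * HOLD_COST
        totals["shortage"] += short * SHORT_COST
        if stock < rol and arrival_day is None:  # assumption: one open order at most
            arrival_day = day + lookup(LEAD_TIME, next(lead_rns))
            totals["ordering"] += ORDER_COST
        rows.append((day, rn, demand, stock, short, arrival_day))
    return rows, totals


rows, totals = simulate(demand_rns=[68, 13], lead_rns=[47])
for row in rows:
    print(row)   # (day, RN, demand, closing stock, shortage, arrival day of open order)
print(totals)
```

Running all 40 days needs 40 demand random numbers plus one lead-time random number per order. Comparing policies then means repeating the run with different rol and roq values on the same random numbers.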


7. QUEUING SYSTEM SIMULATION

Problem

Workshop mechanics wait for service.

Need to decide:

How many attendants should be hired?

Objective

Minimize:

  1. Waiting cost
  2. Wage cost

Arrival Distribution

Minutes Frequency
0 12
3 18
6 50
9 74
12 32
15 14

Total = 200

Convert to probability.

Procedure

  1. Generate arrivals
  2. Generate service time
  3. Calculate waiting
  4. Calculate idle time
  5. Calculate total cost
  6. Compare 1, 2, 3 attendants

Choose minimum cost option.
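A minimal Python sketch of steps 1 to 4 of this procedure. The arrival frequencies come from the table above. The service-time distribution is not given in the text, so a constant 8-minute service time is used as a clearly hypothetical placeholder, and the function names are illustrative.

```python
FREQ = {0: 12, 3: 18, 6: 50, 9: 74, 12: 32, 15: 14}  # minutes between arrivals -> frequency


def interarrival_for(rn, freq=FREQ):
    """Map a two-digit random number (0-99) to an inter-arrival time in minutes."""
    total, running = sum(freq.values()), 0
    for minutes in sorted(freq):
        running += freq[minutes]
        if rn < 100 * running / total:   # cumulative probability expressed on the 0-100 scale
            return minutes
    raise ValueError("rn must be between 0 and 99")


def simulate_queue(interarrivals, service_times, attendants=1):
    """Return (total minutes mechanics wait, total minutes attendants idle before serving)."""
    free_at = [0] * attendants           # time at which each attendant next becomes free
    clock = waiting = idle = 0
    for gap, service in zip(interarrivals, service_times):
        clock += gap                     # arrival time of this mechanic
        k = min(range(attendants), key=lambda i: free_at[i])  # earliest-free attendant
        start = max(clock, free_at[k])
        waiting += start - clock
        idle += max(0, clock - free_at[k])  # idle time of the attendant taking this job
        free_at[k] = start + service
    return waiting, idle


gaps = [interarrival_for(rn) for rn in (61, 24, 3, 92)]
service = [8, 8, 8, 8]                   # hypothetical service times in minutes
for c in (1, 2, 3):
    print(c, simulate_queue(gaps, service, attendants=c))
```

Multiplying the waiting minutes by the cost of a mechanic's time and adding the attendants' wages gives the total cost for each staffing level; both rates would have to come from the problem data.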

8. IMPORTANT FORMULAE

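Probability: P(x) = Frequency of x ÷ Total frequency

Average: Sum of simulated values ÷ Number of trials

Inventory: Closing stock = Opening stock − Demand + New supply

Holding Cost: Closing stock × Holding cost per unit per day

Shortage Cost: Units short × Shortage cost per unit per day

Total Cost: Ordering cost + Holding cost + Shortage cost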

9. ADVANTAGES OF SIMULATION

  1. Handles uncertainty well
  2. Flexible
  3. Safe experimentation
  4. Saves money
  5. Helps decision making
  6. Useful in complex systems
  7. Risk reduction

10. LIMITATIONS

  1. Time consuming
  2. Requires data
  3. Needs expertise
  4. Approximate answers only
  5. No guaranteed optimum

11. APPLICATIONS

Simulation is used in:

  • hospitals
  • inventory
  • airlines
  • traffic
  • manufacturing
  • stock market
  • military planning
  • supply chain
  • banking
  • project management

12. STANDARD PROCEDURE FOR ANY SIMULATION PROBLEM

Step 1: Define objective
Step 2: Identify random variables
Step 3: Build probability table
Step 4: Create cumulative probability
Step 5: Assign intervals
Step 6: Generate random numbers
Step 7: Perform simulation
Step 8: Calculate results
Step 9: Compare policies
Step 10: Select best option

13. QUICK REVISION CHECKLIST

☑ Objective clear
☑ Probability table ready
☑ Random intervals assigned
☑ Random numbers selected
☑ Simulation table prepared
☑ Costs calculated
☑ Policies compared
☑ Final decision made

14. FINAL CONCLUSION

Simulation is a powerful decision-making tool used when systems involve uncertainty and risk.

It helps managers answer:

  • What may happen?
  • What is best policy?
  • What is lowest cost?
  • What is safest decision?

Main idea: “Test on model first, implement later in reality.”

Sub section 1.2

SIMULATION EXPERIMENT / LABORATORY MANUAL

Study and Analysis 

1. Introduction

Simulation is a powerful scientific and computational technique used to imitate the behavior of a real-world system over time through a virtual or mathematical model.

It enables engineers, researchers, and decision-makers to:

  • study system performance,
  • test alternative strategies,
  • predict outcomes,
  • and optimize decisions without disturbing the real system.

Modern engineering and management systems often involve:

  • multiple interacting variables,
  • uncertainty,
  • dynamic changes,
  • and operational complexity.

Traditional analytical methods may fail to solve such problems effectively. Simulation overcomes this limitation by creating a controlled virtual environment for experimentation and analysis.

Simulation is especially useful when:

  • real-world testing is expensive,
  • physical experimentation is unsafe,
  • systems are too complex for exact solutions,
  • “what-if” analysis is needed before implementation.

Common Applications

  • manufacturing systems
  • healthcare systems
  • transportation networks
  • inventory control
  • project management
  • financial risk analysis

2. Basic Simulation Framework

Problem Identification

System Definition & Boundary Setting

Model Development

Data Collection & Input Analysis

Random Variable Generation

Model Execution (Simulation Run)

Output Analysis & Interpretation

Validation & Decision Making

3. General Mathematical Representation

Output = f(Input, Model, Randomness, Time)

Where:

  • Input = known system data
  • Model = logical/mathematical structure
  • Randomness = uncertainty or stochastic behavior
  • Time = dynamic system evolution

This shows that output depends on both deterministic and probabilistic factors.

4. Definition of Simulation

Standard Definition

Simulation is the imitation of the operation of a real-world process or system over time using a model to evaluate performance and support decisions.

Technical Definition

Simulation is a computer-based experimental method used to study dynamic systems under varying assumptions and conditions.

Simple Definition

Simulation means testing ideas virtually before applying them in reality.

Mathematical Definition

S(t) = f(X(t), P, R)

Where:

  • S(t) = system state at time t
  • X(t) = time-dependent inputs
  • P = fixed parameters
  • R = random effects

5. Aim

To study, model, and analyze real-world systems using simulation techniques for:

  • prediction,
  • optimization,
  • risk reduction,
  • evidence-based decision making.

Formula:

Simulation Model = f(System, Inputs, Randomness)

6. Objectives

  1. Understand simulation principles and assumptions.
  2. Convert real systems into mathematical/logical models.
  3. Analyze uncertainty using probability distributions.
  4. Perform sensitivity and what-if analysis.
  5. Optimize cost, time, quality, and resources.
  6. Verify and validate models.
  7. Support engineering and management decisions.

Optimization Principle:

7. Mission

To develop:

  • scientific thinking,
  • analytical capability,
  • problem-solving competence,

through simulation techniques for reliable and data-driven decisions.

Mission Focus

  • operational excellence
  • digital transformation
  • sustainability
  • risk reduction
  • continuous improvement

8. Vision

To establish simulation as a foundation for future intelligent systems.

Strategic Areas

  • smart manufacturing
  • digital twins
  • Industry 4.0
  • AI integration
  • autonomous systems
  • IoT
  • sustainable engineering

Future Formula:

Future Simulation = AI + Big Data + Digital Twin + Automation

9. Key Characteristics of Simulation

  1. Dynamic – time-dependent behavior
  2. Probabilistic – includes uncertainty
  3. Repeatable – multiple replications possible
  4. Flexible – assumptions easily modified
  5. Predictive – estimates future outcomes
  6. Experimental – safe virtual testing

10. Advantages of Simulation

  • minimizes risk and cost
  • saves time
  • improves decision quality
  • supports optimization
  • compares alternatives
  • works when analytical methods fail

11. Limitations of Simulation

  • depends on input data quality
  • poor assumptions give poor results
  • model building can be time-consuming
  • requires expert interpretation
  • does not always guarantee optimum solution

12. Major Types of Simulation

Type Description Example
Monte Carlo Random sampling Risk analysis
Discrete Event Event-based Queue systems
Continuous Continuous change Water tank
Agent-Based Interacting agents Crowd behavior
System Dynamics Feedback loops Population growth

13. Random Number Coding for Demand Distribution

The inverse transform method maps random numbers (00–99) to demand values.

Demand Probability Cumulative RN Interval
30 0.02 0.02 00–01
40 0.08 0.10 02–09
50 0.11 0.21 10–20
60 0.16 0.37 21–36
70 0.19 0.56 37–55
80 0.13 0.69 56–68
90 0.10 0.79 69–78
100 0.08 0.87 79–86
110 0.07 0.94 87–93
120 0.06 1.00 94–99

Example: RN = 47 → Demand = 70

Expected Demand: E(D) = Σ (Demand × Probability) = 74.5 units

14. Lead Time Distribution

Lead Time (days) Probability Cumulative RN Interval
2 0.20 0.20 00–19
3 0.30 0.50 20–49
4 0.35 0.85 50–84
5 0.15 1.00 85–99

Expected Lead Time: E(LT) = Σ (Lead Time × Probability) = 3.45 days
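The expected values of the two tables above can be checked with a few lines of Python. The function name expected_value is an assumption introduced for this sketch; the distributions are copied from the demand and lead-time tables.

```python
demand = {30: 0.02, 40: 0.08, 50: 0.11, 60: 0.16, 70: 0.19,
          80: 0.13, 90: 0.10, 100: 0.08, 110: 0.07, 120: 0.06}
lead_time = {2: 0.20, 3: 0.30, 4: 0.35, 5: 0.15}


def expected_value(dist):
    """Probability-weighted mean of a discrete distribution given as {value: probability}."""
    if abs(sum(dist.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * prob for value, prob in dist.items())


mean_demand = expected_value(demand)        # units
mean_lead_time = expected_value(lead_time)  # days
print(round(mean_demand, 2), round(mean_lead_time, 2))
```

The sketch gives 74.5 units for demand and 3.45 days for lead time, the long-run averages that a sufficiently long simulation should approach.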

Safety Stock:

15. Monte Carlo Simulation

Monte Carlo uses repeated random sampling.

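Formula: Estimate = (x₁ + x₂ + … + xₙ) ÷ n, where xᵢ is the outcome of trial i and n is the number of trials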

Steps

  1. Generate random numbers
  2. Map values
  3. Repeat many trials
  4. Compute average

Applications:

  • finance
  • risk analysis
  • reliability
  • forecasting

16. Discrete Event Simulation (DES)

Models systems where state changes at specific events.

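Little’s Law: L = λW, where L = average number in the system, λ = average arrival rate, W = average time spent in the system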

Applications:

  • hospital queues
  • manufacturing
  • retail checkout

17. Continuous Simulation

State changes continuously over time.

Equation: dx/dt = f(x, t), where x is the system state at time t

Applications:

  • water systems
  • chemical plants
  • fuel systems
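As an illustration of continuous simulation, the sketch below steps a water-tank level forward in small time increments (Euler's method). The tank model, area × dh/dt = q_in − k × h, and all parameter values are assumptions chosen for the example; they do not come from the manual.

```python
def simulate_tank(h0=0.0, q_in=2.0, k=0.5, area=1.0, dt=0.01, t_end=20.0):
    """Euler integration of area * dh/dt = q_in - k*h; returns a list of (time, level)."""
    steps = int(round(t_end / dt))
    h = h0
    history = [(0.0, h0)]
    for i in range(1, steps + 1):
        dhdt = (q_in - k * h) / area   # net inflow rate divided by tank area
        h += dhdt * dt                 # advance the level by one small time step
        history.append((i * dt, h))
    return history


history = simulate_tank()
final_level = history[-1][1]
steady_state = 2.0 / 0.5               # level at which inflow equals outflow
print(f"level at t = 20: {final_level:.3f}, steady state: {steady_state:.3f}")
```

A smaller dt tracks the true curve more closely at the cost of more steps, which is the basic accuracy trade-off in continuous simulation.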

18. Resource Utilization

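Formula: Utilization (%) = (Busy time ÷ Total time) × 100

Idle Time: Idle time = Total time − Busy time

Example: Busy = 8 hrs, Total = 10 hrs
Utilization = 80%, Idle time = 2 hrs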

19. Verification and Validation

Verification: Are we building the model correctly?
Validation: Are we building the correct model?

Error:

20. Continuous Improvement

Improvement Formula:

Used in:

  • Lean
  • Six Sigma
  • Kaizen
  • layout optimization

21. Simulation Software Tools

Popular software: Arena, AnyLogic, MATLAB, Python, Excel

Features:

  • drag-and-drop modeling
  • 2D/3D visualization
  • Excel/database integration
  • reporting dashboards

22. Viva Questions

Q1. What is simulation?
Virtual modeling of a real system.

Q2. What is Monte Carlo simulation?
Random sampling-based simulation.

Q3. DES vs Continuous?
DES = event-based; Continuous = differential equations.

Q4. What is validation?
Checking whether model matches reality.

Q5. What is Little’s Law?
L = λW


23. Conclusion

Simulation is a cornerstone technique in:

  • engineering,
  • operations research,
  • industrial management.

It helps to:

✔ reduce cost
✔ reduce risk
✔ improve efficiency
✔ optimize systems
✔ support intelligent decisions


Final One-Line Summary

Simulation helps us model reality, test alternatives, reduce uncertainty, and make intelligent engineering decisions.

— End of Integrated Laboratory Manual —


Sub section 2.0

SIMULATION STUDY

Last Updated: May 2026

TABLE OF CONTENTS

1.  Fundamentals of Simulation

1.1  Definition & Purpose

1.2  When to Use Simulation

1.3  Types of Simulation Models

2.  The Complete Simulation Lifecycle

2.1  Overview & Integrated Flow

2.2  Step-by-Step Breakdown (9 Steps)

3.  Pre-Simulation Decisions

3.1  Feasibility Assessment

3.2  Cost-Benefit Analysis

4.  Advanced Concepts & KPIs

4.1  Assumptions & Simplifications

4.2  Experimental Design

4.3  Sensitivity & Risk Analysis

5.  Case Study & Applications

6.  Best Practices & Conclusion

1.  FUNDAMENTALS OF SIMULATION

1.1  Definition & Purpose

Definition: Simulation is the process of creating a mathematical or computational model of a real-world system to study its behavior, predict outcomes, and support decision-making without directly experimenting on the actual system.

Core Objectives:

•  Predict future outcomes and system behavior

•  Improve strategic and tactical decisions

•  Reduce operational and financial risk

•  Optimize system performance and efficiency

•  Test multiple scenarios before implementation

•  Support training and system understanding

1.2  When to Use Simulation

Criterion | Use Simulation | Use Analytical
Complexity | High (many variables, interactions) | Low (simple systems)
Randomness | High uncertainty | Deterministic
Time Dependency | Dynamic/time-varying | Static
Solution Method | Difficult/impossible analytically | Closed-form solution exists
Example | Traffic, queues, supply chains | Simple interest, linear equations

1.3  Types of Simulation Models

Deterministic Simulation: No randomness; outputs are fixed for given inputs. Example: Machine capacity calculations, simple scheduling.

Stochastic Simulation: Includes probability and randomness. Example: Customer arrivals, demand variations, failures.

Discrete Event Simulation (DES): System state changes only when events occur. Example: Bank queues, manufacturing, healthcare.

Continuous Simulation: System variables change continuously over time. Example: Water tank level, temperature dynamics.

Monte Carlo Simulation: Repeated random sampling to estimate probability distributions. Example: Risk analysis, financial projections, project timelines.

2.  THE COMPLETE SIMULATION LIFECYCLE

2.1  Integrated Simulation Flow

A successful simulation study follows a structured, iterative process. The nine core steps must be executed sequentially with feedback loops for validation and refinement. Each step builds on previous outputs and feeds into decision-making.

2.2  Step-by-Step Breakdown

STEP 1: Problem Definition

What is the problem? Why does it matter?

•  Clearly articulate the problem or opportunity.

•  Define business/operational objectives.

•  Identify scope: What is included? What is excluded?

•  Set measurable success criteria.

•  Document stakeholders and decision-makers.

✓ Problem Statement & Objectives

STEP 2: Project Planning

How will we execute the simulation study?

•  Define timeline, budget, and resource allocation.

•  Assign responsibilities (data collectors, modelers, analysts).

•  Plan data collection strategy and timeline.

•  Identify tools and software required.

•  Create milestone checkpoints for quality assurance.

✓ Project Plan & Resource Schedule

STEP 3: System Definition

What are the system boundaries and components?

•  Identify all key system inputs (arrivals, demand, failures).

•  Define system outputs (throughput, cycle time, cost).

•  Map system components and their interactions.

•  Establish system boundaries and constraints.

•  Create system architecture diagrams.

✓ System Structure & Architecture

STEP 4: Model Formulation

How do we represent the system logically?

•  Translate real-world system into logical structure.

•  Create flowcharts or process diagrams.

•  Define entity types (customers, products, resources).

•  Specify entities, attributes, activities, and events.

•  Establish key assumptions explicitly.

✓ Conceptual Model & Flowcharts

STEP 5: Data Collection & Analysis

What data drives the simulation?

•  Gather historical data on arrival times, service durations, failures.

•  Calculate statistical measures: mean, variance, distribution type.

•  Fit data to probability distributions (Normal, Poisson, Exponential).

•  Test data for randomness and independence.

•  Document data sources and assumptions.

✓ Validated Input Data & Distributions

STEP 6: Model Translation (Programming)

How do we implement the model computationally?

•  Choose simulation software (Arena, AnyLogic, MATLAB, Python, Excel).

•  Code or configure the logical model in selected tool.

•  Implement probability distributions and random number generation.

•  Build user interfaces and dashboards for output visualization.

•  Create configurable parameters for experimentation.

✓ Executable Simulation Program

STEP 7: Verification & Validation

Is the model correct and trustworthy?

•  Verification: "Are we building the model right?" – Debugging logic, checking for negative queues, animation review.

•  Validation: "Are we building the right model?" – Compare simulation output with real system (historical data or pilot).

•  Statistical tests (t-tests, confidence intervals) to compare real vs. simulated.

•  Face validation with subject matter experts.

•  Address discrepancies and refine model iteratively.

✓ Verified & Validated Model

STEP 8: Experimentation & Analysis

What scenarios should we test?

•  Design experiments: test alternative configurations, demand levels, staffing levels.

•  Run multiple replications (typically 50-200) to account for randomness.

•  Collect performance metrics: throughput, utilization, waiting time, cost.

•  Perform sensitivity analysis: "What if input X changes by 10%?"

•  Compare scenarios statistically (ANOVA, confidence intervals).

•  Identify best solution based on objectives.

✓ Scenario Analysis & Recommendations
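Summarising replications with a confidence interval can be sketched as follows. The function name confidence_interval and the eight waiting-time values are hypothetical. The sketch uses a normal approximation, which is reasonable for the 50 to 200 replications suggested in this step but too narrow for very small samples, where a t-quantile would be the usual choice.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the mean of replication outputs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two replications")
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    half_width = z * stdev(samples) / n ** 0.5   # z times the standard error of the mean
    centre = mean(samples)
    return centre - half_width, centre + half_width


# Hypothetical average waiting times (minutes), one value per replication
waits = [4.2, 5.1, 3.8, 4.9, 4.4, 5.3, 4.0, 4.7]
low, high = confidence_interval(waits)
print(f"mean wait = {mean(waits):.2f} min, 95% CI = ({low:.2f}, {high:.2f})")
```

Two scenarios whose intervals overlap heavily cannot be ranked with confidence; more replications narrow the intervals.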

STEP 9: Documentation & Implementation

How do we communicate and implement findings?

•  Create comprehensive final report with findings and recommendations.

•  Prepare executive summary for decision-makers.

•  Develop visual dashboards and charts (graphs, heatmaps).

•  Provide implementation guidelines and transition plan.

•  Plan for ongoing monitoring and model maintenance.

•  Transfer knowledge to operations team.

✓ Final Report & Implementation Plan

3.  PRE-SIMULATION DECISIONS & FEASIBILITY

3.1  Feasibility Assessment Criteria

Before investing in simulation, conduct a feasibility study to ensure the approach is justified.

Problem Complexity: Is the problem too complex for analytical solutions? Does it involve multiple interacting variables?

System Uncertainty: Does the system involve significant randomness or variability?

Time Dynamics: Does the system behavior depend critically on time-dependent events?

Data Availability: Can we collect sufficient, accurate, representative data?

Resource Availability: Do we have trained personnel, time, and computing power?

Cost-Benefit Ratio: Is the expected benefit (cost savings, improved decisions) > simulation cost?

3.2  Cost-Benefit Analysis

Use this formula to justify simulation investment:

Net Benefit = Expected Annual Savings − Simulation Development Cost − Annual Maintenance Cost

Example: Manufacturing System Simulation

•  Expected reduction in inventory: ₹500,000/year

•  Expected reduction in machine idle time: ₹200,000/year

•  Improved scheduling efficiency: ₹150,000/year

•  Total expected savings: ₹850,000/year

•  Simulation development cost: ₹100,000 (one-time)

•  Annual maintenance & updates: ₹30,000/year

•  Year 1 Net Benefit: ₹850,000 − ₹100,000 − ₹30,000 = ₹720,000 ✓ POSITIVE

•  Payback period: ~1.4 months (development cost ÷ monthly savings; highly justified)
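The arithmetic of this example can be laid out in a short Python sketch. Variable names are illustrative, and amounts are in the example's currency units. Subtracting the annual maintenance term, as the net-benefit formula states, gives a Year 1 figure of 720,000. The payback period here is taken as development cost divided by monthly gross savings, which is an assumption about how the ~1.4 months was obtained.

```python
savings = {"inventory": 500_000, "idle_time": 200_000, "scheduling": 150_000}  # per year
development_cost = 100_000    # one-time
annual_maintenance = 30_000   # per year

annual_savings = sum(savings.values())
net_benefit_year1 = annual_savings - development_cost - annual_maintenance
payback_months = development_cost / (annual_savings / 12)

print(f"annual savings    = {annual_savings:,}")
print(f"year 1 net benefit = {net_benefit_year1:,}")
print(f"payback period     = {payback_months:.1f} months")
```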

4.  ADVANCED CONCEPTS & PERFORMANCE METRICS

4.1  Assumptions & Model Simplifications

Every model requires simplifying assumptions. These must be documented and validated.

Examples of Common Assumptions:

•  Employees work exactly 8 hours/day with no variations

•  No holidays or unplanned absences in the planning period

•  Machine failure rates remain constant (stationary)

•  Service times follow a specific distribution (e.g., exponential)

•  Customer arrivals are independent and random

•  System is in steady state by time T hours

Impact of Wrong Assumptions:

•  Inaccurate model output → Poor decisions

•  Model may not reflect real-world constraints

•  Risk of over-optimizing based on false premises

Best Practice: Document all assumptions explicitly. Conduct sensitivity analysis to test robustness.

4.2  Experimental Design

Element | Description | Example
Number of Runs | How many replications? | 100-500 runs
Warm-up Period | Initial time to reach steady state | 1000-5000 time units
Run Length | Simulation duration per run | 8 hours, 1 week, 1 month
Random Seed | Initialize randomness identically or independently? | Different seeds per run
Batch Means | Group runs for statistical analysis | Batches of 10 runs

4.3  Key Performance Indicators (KPIs)

KPI | Definition | Formula/Calculation | Application
Throughput | Items produced/serviced | Total output / time | Production, queues
Utilization | Resource usage efficiency | (Busy time / Total time) × 100% | Machines, staff
Waiting Time | Average time in queue | Sum of queue waits / count | Service systems
Cycle Time | Time from start to finish | Exit time − Entry time | Manufacturing
Queue Length | Average number waiting | Sum of lengths / observations | Bottleneck analysis
Cost | Total operational expense | Labour + Materials + Overhead | Financial analysis
Service Level | On-time performance % | (On-time deliveries / Total) × 100% | Supply chain

4.4  Sensitivity Analysis

Purpose: Test robustness of model by varying inputs ±10-20% and observing output changes.

Example: If input demand increases by 15%, does output throughput increase linearly, or do bottlenecks cause disproportionate degradation? This identifies critical parameters.
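A one-factor sensitivity check of this kind can be sketched in Python. The model below, throughput capped at a fixed line capacity, is a deliberately simple hypothetical stand-in for a full simulation; the capacity of 260 parts/day, the base demand of 240, and the function names are assumptions.

```python
def throughput(demand, capacity=260):
    """Hypothetical model: the line ships what is demanded, up to `capacity` parts/day."""
    return min(demand, capacity)


def sensitivity(model, base_input, changes=(-0.20, -0.10, 0.0, 0.10, 0.15, 0.20)):
    """Return (relative input change, relative output change) pairs against the base case."""
    base_output = model(base_input)
    return [(c, model(base_input * (1 + c)) / base_output - 1) for c in changes]


results = sensitivity(throughput, base_input=240)
for change, effect in results:
    print(f"demand {change:+.0%} -> throughput {effect:+.1%}")
```

In this toy model the output follows demand one-for-one until the capacity limit is reached and then stops responding, which is the kind of non-linear behaviour sensitivity analysis is meant to expose.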

4.5  Risk & Uncertainty Analysis

Monte Carlo Risk Analysis: Run simulation 5,000–10,000 times with random variations to estimate probability distributions of outcomes. This provides confidence intervals and risk quantification.

Example: Project completion time could be 45–65 days with 85% confidence; 35% chance of exceeding 60 days.
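A Monte Carlo risk estimate of this kind can be sketched as follows. The three tasks, their triangular duration estimates, the 60-day threshold, and the function names are all hypothetical, so the output will not match the figures quoted in the example above.

```python
import random

# Hypothetical sequential tasks: (optimistic, most likely, pessimistic) durations in days
TASKS = [(10, 15, 25), (12, 18, 30), (14, 20, 28)]


def simulate_project(rng):
    """One trial: total duration when each task is drawn from a triangular distribution."""
    return sum(rng.triangular(low, high, mode) for low, mode, high in TASKS)


def risk_profile(trials=10_000, threshold=60, seed=1):
    """Return (5th percentile, 95th percentile, probability of exceeding threshold)."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    durations = sorted(simulate_project(rng) for _ in range(trials))
    p_exceed = sum(d > threshold for d in durations) / trials
    p5 = durations[int(0.05 * trials)]
    p95 = durations[int(0.95 * trials) - 1]
    return p5, p95, p_exceed


p5, p95, p_exceed = risk_profile()
print(f"90% of runs fall between {p5:.1f} and {p95:.1f} days")
print(f"chance of exceeding 60 days: {p_exceed:.1%}")
```

The percentile range and the exceedance probability are read directly from the sorted trial results, so no assumption about the shape of the output distribution is needed.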

5.  INDUSTRIAL CASE STUDY: CONVEYOR LINE OPTIMIZATION

Problem Statement

A manufacturing facility with a 5-station assembly line is experiencing production bottlenecks. Machine 2 consistently has the longest queue, reducing overall throughput by 20%. Management must decide whether to add a second identical machine or implement process improvements.

Simulation Approach

Step 1 – Data Collection: Recorded 500 part processing times for each machine, fitted to distributions.

Step 2 – Model Formulation: Created DES model with 5 machines, FIFO queues, random arrivals.

Step 3 – Base Case Simulation: Ran 100 replications over 40 working days.

Step 4 – Scenario Testing:

•  Scenario A: Add second Machine 2 (cost: ₹500,000)

•  Scenario B: Improve Machine 2 speed by 15% (cost: ₹100,000)

•  Scenario C: Implement parallel processing (cost: ₹300,000)

Step 5 – Results Analysis:

Results Comparison

Metric | Base Case | Scenario A (Add Machine) | Scenario B (Speed +15%) | Scenario C (Parallel)
Throughput (parts/day) | 240 | 286 (+19%) | 268 (+12%) | 280 (+17%)
Avg Machine 2 Queue | 8.2 parts | 2.1 parts | 4.5 parts | 3.0 parts
Machine 2 Utilization | 92% | 65% | 85% | 75%
Capital Cost | – | ₹500K | ₹100K | ₹300K
Payback Period | – | 6.5 months | 2.1 months | 4.8 months

Recommendation

Implementation: Scenario B (Process Improvement). Although Scenario A offers highest throughput, Scenario B provides the best ROI (2.1 months payback) with lower capital risk. Recommended next steps: pilot the speed improvement on Machine 2, monitor results, and revisit addition of second machine if demand grows.

6.  BEST PRACTICES & CONCLUSION

6.1  Advantages of Simulation

✓  Safe Experimentation: Test ideas without disrupting real operations.

✓  Cost-Effective: Avoid expensive real-world mistakes.

✓  Multiple Scenarios: Explore dozens of alternatives quickly.

✓  Time Compression: Run months of operation in seconds.

✓  Risk Quantification: Probabilistic estimates with confidence intervals.

✓  Strategic Planning: Long-term 'what-if' analysis for capacity, investment, expansion.

✓  Communication: Visual animations persuade stakeholders more than reports.

✓  Training: Use model as a sandbox for staff learning and training.

6.2  Limitations to Consider

Development Time: Complex models take weeks/months to build and validate.

Data Requirements: Garbage in → garbage out. Poor data = poor results.

Expert Dependency: Requires skilled modelers; results vary by model builder.

Model Simplification: Reality is always more complex; some details omitted.

Behavioral Assumptions: Model may not capture human adaptability or learning.

Over-Optimization: Risk of optimizing for metrics that don't reflect true business value.

6.3  Simulation Best Practices

1.     Start Simple: Build a baseline model first, add complexity incrementally.

2.     Validate Rigorously: Spend 40% of time on validation/verification.

3.     Document Everything: Assumptions, data sources, code logic, validation results.

4.     Engage Stakeholders: Get feedback from domain experts throughout development.

5.     Use Real Data: Collect actual historical data; don't guess.

6.     Plan Experiments: Design factorial or screening experiments, not random tests.

7.     Report Confidence Intervals: Never report single-point estimates; always include ranges.

8.     Maintain the Model: Update as business processes change; old models become useless.

9.     Conduct Sensitivity Analysis: Identify which inputs most impact outputs.

10.  Communicate Results Visually: Use dashboards, animations, and charts for impact.

6.4  Simulation Success Checklist

■ Problem is clearly defined with measurable objectives.

■ Feasibility study justifies investment (cost-benefit positive).

■ Required data is available and validated.

■ Team has necessary expertise in modeling, programming, and statistics.

■ System boundaries and assumptions are explicitly documented.

■ Model is verified (logic is correct) and validated (matches real system).

■ Sufficient experimental replications planned (minimum 50-100).

■ Key performance metrics defined and tracked.

■ Sensitivity analysis identifies critical input variables.

■ Results are communicated with confidence intervals, not point estimates.

■ Recommendations are actionable and prioritized by impact.

■ Implementation plan includes monitoring, review, and maintenance.

6.5  Complete Simulation Lifecycle Overview

1.     Problem Identification → Define objective

2.     Feasibility Study → Justify investment

3.     Project Planning → Schedule, budget, team

4.     System Definition → Boundaries, components, inputs/outputs

5.     Model Formulation → Convert to equations, flowcharts, logic

6.     Data Collection & Analysis → Validate, fit distributions

7.     Programming (Translation) → Code in chosen tool

8.     Verification → Logic is correct? (debugging)

9.     Validation → Model matches reality? (statistical tests)

10.  Experimentation → Run scenarios, collect metrics

11.  Analysis & Optimization → Compare, identify best alternative

12.  Recommendation → Report findings and proposed action

13.  Implementation → Execute decision, monitor

14.  Continuous Review → Update model as business changes

6.6  Final Expert Guidance

What Makes a Good Simulation Model?

✓  Valid: Accurately represents the real system.

✓  Reliable: Produces consistent, reproducible results.

✓  Simple: Only as complex as necessary; Occam's Razor principle.

✓  Flexible: Easy to modify for new scenarios and questions.

✓  Cost-Effective: Development cost justified by value delivered.

✓  Decision-Oriented: Directly supports the decision-maker's question.

When Simulation Fails:

•  Too much focus on model complexity vs. problem clarity.

•  Poor data quality undermining results credibility.

•  Model built for wrong stakeholder or problem.

•  Insufficient validation before running experiments.

•  Results treated as 'the answer' rather than decision support.

Key Takeaway:

Simulation is not a black box that produces definitive answers. Rather, it is a powerful structured methodology for exploring system behavior, reducing risk, and making informed decisions under uncertainty. Successful simulation requires: correct steps + correct data + correct interpretation = valuable insights.

Sub section 2.1



SIMULATION STUDY

Integrated Framework & Methodology

M.Tech / Advanced Engineering Level

Complete Lifecycle • Decision Framework • Advanced Concepts • Industrial Applications
Version: May 2026


ABSTRACT

Simulation is a computational and analytical methodology used to model, analyze, and optimize complex real-world systems under uncertainty. It enables decision-makers to evaluate alternative scenarios, quantify risk, and improve system performance without disturbing actual operations. This manual presents a complete simulation lifecycle—from problem definition to implementation—integrating theory, methodology, industrial practice, and advanced decision frameworks.


1. FUNDAMENTALS OF SIMULATION

1.1 Definition

Simulation is the process of constructing a mathematical or computational representation of a real-world system and experimenting on that model to understand system behavior, predict outcomes, and support decisions.

Mathematical Representation

Y = f(X, P, R, t)

Where:

  • Y = system output
  • X = input variables
  • P = model parameters
  • R = randomness/stochastic effects
  • t = time

1.2 Purpose

  • Predict future outcomes
  • Reduce risk
  • Optimize resources
  • Support policy decisions
  • Compare alternatives
  • Enable virtual experimentation

1.3 When to Use Simulation

Condition               Simulation Preferred    Analytical Preferred
Complexity              High                    Low
Randomness              Present                 Minimal
Time dependence         Dynamic                 Static
Closed-form solution    Not available           Available

1.4 Types of Simulation

  1. Deterministic Simulation – fixed outputs
  2. Stochastic Simulation – includes randomness
  3. Discrete Event Simulation (DES) – event-based
  4. Continuous Simulation – differential equations
  5. Monte Carlo Simulation – repeated random sampling
  6. Agent-Based Simulation – interacting autonomous agents
  7. Hybrid Simulation – mixed methodologies

2. COMPLETE SIMULATION LIFECYCLE

Integrated Flow

Problem Definition

Feasibility Analysis

Project Planning

System Definition

Model Formulation

Data Collection & Analysis

Model Translation

Verification & Validation

Experimentation

Optimization

Recommendation

Implementation & Monitoring

Step 1: Problem Definition

Define:

  • objective
  • constraints
  • scope
  • stakeholders
  • measurable success metrics

Output: Problem Statement


Step 2: Feasibility Study

Assess:

  • technical feasibility
  • economic feasibility
  • data availability
  • organizational readiness

Net Benefit = Total Benefits − Total Costs

Step 3: Project Planning

Plan:

  • timeline
  • budget
  • resources
  • milestones
  • risk register

Step 4: System Definition

Define:

  • boundaries
  • inputs
  • outputs
  • entities
  • resources
  • constraints

Step 5: Model Formulation

Develop:

  • logical model
  • process maps
  • assumptions
  • equations
  • flowcharts

Step 6: Data Collection & Analysis

Tasks:

  • collect historical data
  • clean data
  • fit probability distributions
  • validate independence
  • estimate parameters

Common distributions:

  • Normal
  • Poisson
  • Exponential
  • Weibull
  • Uniform

Step 7: Model Translation

Typical tools:





Deliverable: executable simulation model

Step 8: Verification & Validation

Verification

“Did we build the model correctly?”

Check:

  • logic
  • coding
  • event handling
  • negative queues
  • unit consistency

Validation

“Did we build the correct model?”

Compare: simulated output against actual system data.

Methods:

  • t-test
  • confidence intervals
  • expert review

Step 9: Experimentation & Analysis

Perform:

  • multiple replications
  • scenario testing
  • sensitivity analysis
  • optimization runs

Typical replications: 50–500

3. ADVANCED CONCEPTS

3.1 Assumptions

Examples:

  • constant failure rate
  • steady-state operation
  • independent arrivals
  • fixed shift duration

Rule: Document every assumption explicitly.

3.2 Warm-Up Period

Used to remove initialization bias.

Example: Ignore first 1000 time units.
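A minimal sketch of warm-up deletion, assuming a single-server FIFO queue with exponential interarrival and service times. The rates, run length, and seed are illustrative values, not taken from this manual, and the warm-up is counted in customers rather than time units to keep the example short.

```python
import random
import statistics


def simulate_waits(n_customers, arrival_rate, service_rate, seed):
    """Return the queueing delay of each customer in a single-server FIFO queue.

    Uses Lindley's recursion: W[k+1] = max(0, W[k] + S[k] - A[k+1]).
    """
    rng = random.Random(seed)
    waits = []
    wait = 0.0  # the system starts empty, which biases early observations low
    for _ in range(n_customers):
        waits.append(wait)
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)
    return waits


def steady_state_mean(observations, warm_up):
    """Average the observations after discarding the first `warm_up` of them."""
    return statistics.mean(observations[warm_up:])


# Hypothetical settings: arrival rate 0.9, service rate 1.0, 20,000 customers.
waits = simulate_waits(n_customers=20_000, arrival_rate=0.9, service_rate=1.0, seed=1)
print("mean wait, all data       :", round(statistics.mean(waits), 3))
print("mean wait, warm-up removed:", round(steady_state_mean(waits, warm_up=1000), 3))
```

In practice the warm-up length is usually chosen by plotting a running average of the output and cutting where it levels off.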

3.3 Random Seeds

Control reproducibility.

Same seed → same output
Different seed → independent runs
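The effect of a seed can be illustrated with Python's standard `random` module. This is a small sketch; the seed values 42 and 43 are arbitrary choices.

```python
import random


def run_once(seed, n=5):
    """Draw n two-digit random numbers (00-99) from a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]


run_a = run_once(seed=42)
run_b = run_once(seed=42)  # same seed: identical stream
run_c = run_once(seed=43)  # different seed: a different stream

print(run_a == run_b)  # True
print(run_a, run_c)
```

Recording the seed used for each replication makes any individual run repeatable for debugging.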

3.4 Sensitivity Analysis

Test:

±10% to ±20% parameter changes

Purpose: Identify critical variables.
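A one-at-a-time sensitivity sweep might look like the sketch below. The model is an assumption made for illustration: the mean time in system of an M/M/1 queue, W = 1/(μ − λ), with hypothetical rates of 8 arrivals and 12 services per hour.

```python
def time_in_system(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue: W = 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable when arrival_rate >= service_rate")
    return 1.0 / (service_rate - arrival_rate)


def sensitivity(base_value, model, changes=(-0.20, -0.10, 0.10, 0.20)):
    """Return {relative change in input: relative change in output}."""
    base_output = model(base_value)
    return {
        change: model(base_value * (1.0 + change)) / base_output - 1.0
        for change in changes
    }


# Hypothetical base case: 8 arrivals/hour against a service rate of 12/hour.
result = sensitivity(8.0, lambda lam: time_in_system(lam, service_rate=12.0))
for change, effect in result.items():
    print(f"arrival rate {change:+.0%} -> time in system {effect:+.1%}")
```

An output change much larger than the input change (here roughly +67% for a +20% input change) marks the parameter as critical.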

3.5 Risk Analysis

Monte Carlo:

Typical runs: 5000–10000

Output:

  • mean
  • variance
  • confidence intervals
  • exceedance probability
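The four outputs listed above can be computed from a batch of Monte Carlo runs as sketched here. The cost model (three triangular cost items), the threshold of 60, and the seed are hypothetical; the confidence interval uses the normal approximation with z = 1.96.

```python
import math
import random
import statistics


def monte_carlo(model, n_runs, seed):
    """Run `model` n_runs times with a seeded generator and return the outputs."""
    rng = random.Random(seed)
    return [model(rng) for _ in range(n_runs)]


def summarize(outputs, threshold, z=1.96):
    """Mean, sample variance, approximate 95% CI of the mean, and P(output > threshold)."""
    n = len(outputs)
    mean = statistics.mean(outputs)
    variance = statistics.variance(outputs)
    half_width = z * math.sqrt(variance / n)
    exceedance = sum(1 for y in outputs if y > threshold) / n
    return {
        "mean": mean,
        "variance": variance,
        "ci95": (mean - half_width, mean + half_width),
        "exceedance": exceedance,
    }


def project_cost(rng):
    """Hypothetical model: three cost items, each triangular(low, high, mode)."""
    return (rng.triangular(10, 20, 12)
            + rng.triangular(5, 15, 8)
            + rng.triangular(20, 40, 25))


summary = summarize(monte_carlo(project_cost, n_runs=5000, seed=7), threshold=60)
print(summary)
```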

4. KEY PERFORMANCE INDICATORS (KPIs)

Throughput


Utilization

U = Busy Time / Total Time

Little’s Law

L = λW (average number in system = arrival rate × average time in system)

Cycle Time


Service Level


5. INDUSTRIAL CASE STUDY

Conveyor Line Optimization

Problem: Machine 2 bottleneck causing 20% throughput loss.

Scenarios:

A. Add machine
B. Increase speed +15%
C. Parallel process

Recommendation: Scenario B

Reason: Best ROI and lowest risk.

6. ADVANTAGES

✓ Safe experimentation
✓ Faster decision-making
✓ Reduced cost
✓ Better planning
✓ Visual communication
✓ Risk quantification

7. LIMITATIONS

✗ depends on input quality
✗ time-intensive
✗ requires expertise
✗ simplification bias
✗ may miss human behavior

8. BEST PRACTICES

  1. Start simple
  2. Validate rigorously
  3. Use real data
  4. Document assumptions
  5. Run sensitivity analysis
  6. Report confidence intervals
  7. Maintain model over time

9. RESEARCH APPLICATIONS

Used in:

10. CONCLUSION

Simulation is not a machine that gives answers automatically.

It is a scientific decision-support methodology.

Success depends on:

Correct Problem + Correct Data + Correct Model + Correct Interpretation

=

Reliable Decision Support

End of Advanced Simulation Study Manual
(Suitable for M.Tech, PhD coursework, dissertation appendix, industrial training, and advanced viva)



SIMULATION EXPERIMENT / LABORATORY MANUAL

Study and Analysis of Simulation Techniques for Modeling and Optimization of Real-World Systems

Course Level: M.Tech / Advanced Engineering
Subject Area: /
Version: May 2026


ABSTRACT

Simulation is a computational and analytical technique used to imitate the behavior of real-world systems through mathematical or virtual models. It enables engineers, researchers, and managers to analyze system performance, evaluate alternative decisions, and optimize outcomes without disturbing the actual system.

This laboratory manual presents the theoretical foundation, mathematical principles, experimental procedures, practical applications, and validation methods used in simulation studies.


1. INTRODUCTION

Simulation is a scientific method used to model complex systems and observe their behavior over time under different conditions.

It is especially useful when:

  • real-world experiments are expensive,
  • physical testing is risky,
  • systems are highly complex,
  • analytical solutions are difficult or impossible.

General Mathematical Representation


Y = f(X, M, R, t)

Where:

  • Y = Output
  • X = Input Variables
  • M = Model Structure
  • R = Randomness / Uncertainty
  • t = Time

2. AIM

To study the concept, methodology, and practical applications of simulation and analyze system performance for effective decision-making and optimization.


3. OBJECTIVES

  1. Understand simulation fundamentals.
  2. Study different simulation models.
  3. Generate and analyze random variables.
  4. Apply Monte Carlo and DES techniques.
  5. Evaluate system performance.
  6. Verify and validate simulation models.

4. APPARATUS / SOFTWARE REQUIRED

  • Computer / Laptop
  • Microsoft Excel
  • Statistical Data
  • Simulation Software:

5. THEORY

5.1 Definition

Simulation is the imitation of a real-world system using a mathematical or computer model to study its behavior over time.

Standard Definition

Simulation is the imitation of the operation of a real-world system over time.

Technical Definition

Simulation is a computer-based experimental technique used to evaluate system behavior under varying assumptions.

Simple Definition

Simulation means testing ideas virtually before implementing them in reality.


5.2 Mathematical Model


S(t)=f(X(t),P,R)

Where:

  • S(t) = System state at time t
  • X(t) = Time-dependent inputs
  • P = Parameters
  • R = Random effects

6. TYPES OF SIMULATION

Type               Description          Example
Monte Carlo        Random sampling      Risk analysis
Discrete Event     Event-based          Queue system
Continuous         Continuous change    Water tank
Agent-Based        Individual agents    Crowd behavior
System Dynamics    Feedback systems     Population growth

7. PROCEDURE

Step 1: Problem Identification

Define system boundaries and objectives.

Step 2: Model Development

Create logical/mathematical model.

Step 3: Data Collection

Collect input variables and probability data.

Step 4: Random Number Generation

Generate random numbers (00–99).

Step 5: Simulation Run

Execute multiple trials.

Step 6: Output Analysis

Analyze results and compare alternatives.

Step 7: Validation

Check model accuracy.


8. OBSERVATION TABLE

8.1 Demand Distribution

Demand    Probability    Cumulative    RN Interval
30        0.02           0.02          00–01
40        0.08           0.10          02–09
50        0.11           0.21          10–20
60        0.16           0.37          21–36
70        0.19           0.56          37–55
80        0.13           0.69          56–68
90        0.10           0.79          69–78
100       0.08           0.87          79–86
110       0.07           0.94          87–93
120       0.06           1.00          94–99

Example: RN = 47 → Demand = 70 units
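The random-number lookup can be sketched in Python as follows. The probabilities are those of the demand distribution; the function names and the seed are illustrative choices.

```python
import random

# (demand, probability) pairs from the demand distribution table.
DEMAND_DISTRIBUTION = [
    (30, 0.02), (40, 0.08), (50, 0.11), (60, 0.16), (70, 0.19),
    (80, 0.13), (90, 0.10), (100, 0.08), (110, 0.07), (120, 0.06),
]


def build_intervals(distribution):
    """Turn (value, probability) pairs into (low, high, value) two-digit RN intervals."""
    intervals = []
    low = 0
    for value, probability in distribution:
        width = round(probability * 100)  # number of two-digit RNs assigned
        intervals.append((low, low + width - 1, value))
        low += width
    return intervals


def lookup(rn, intervals):
    """Return the value whose RN interval contains the two-digit number rn."""
    for low, high, value in intervals:
        if low <= rn <= high:
            return value
    raise ValueError(f"random number {rn} is outside 00-99")


DEMAND_INTERVALS = build_intervals(DEMAND_DISTRIBUTION)
print(lookup(47, DEMAND_INTERVALS))  # 70, matching the example above

rng = random.Random(2026)  # arbitrary seed, for reproducibility only
print([lookup(rng.randint(0, 99), DEMAND_INTERVALS) for _ in range(10)])
```

The same two functions work for any discrete distribution whose probabilities are given to two decimal places, such as a lead time distribution.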


8.2 Lead Time Distribution

Lead Time    Probability    Cumulative    RN Interval
2            0.20           0.20          00–19
3            0.30           0.50          20–49
4            0.35           0.85          50–84
5            0.15           1.00          85–99

9. CALCULATIONS

Expected Demand


E(D)=\sum x_i p_i

Monte Carlo Estimate


\hat{Y}=\frac{1}{N}\sum y_i
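As a check on the two formulas above, the sketch below computes E(D) exactly from the demand distribution of Section 8.1 and compares it with a Monte Carlo estimate. The number of trials and the seed are arbitrary choices.

```python
import random

demands = [30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
probabilities = [0.02, 0.08, 0.11, 0.16, 0.19, 0.13, 0.10, 0.08, 0.07, 0.06]

# Exact expectation: E(D) = sum of x_i * p_i.
expected_demand = sum(x * p for x, p in zip(demands, probabilities))


def monte_carlo_mean(n_trials, seed):
    """Estimate E(D) as the sample mean of n_trials simulated demands."""
    rng = random.Random(seed)
    draws = rng.choices(demands, weights=probabilities, k=n_trials)
    return sum(draws) / n_trials


estimate = monte_carlo_mean(n_trials=10_000, seed=11)
print(round(expected_demand, 2))  # 74.5
print(round(estimate, 2))         # close to 74.5; the gap shrinks as N grows
```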

Little’s Law


L=\lambda W

Utilization


U=\frac{Busy\ Time}{Total\ Time}

Safety Stock


SS=z\sigma_L
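A short sketch of the safety stock formula, assuming demand during lead time is normally distributed so that z is the standard-normal quantile of the target service level. The 95% service level and σ_L = 40 units are hypothetical inputs.

```python
from statistics import NormalDist


def safety_stock(service_level, sigma_lead_time_demand):
    """SS = z * sigma_L, with z the standard-normal quantile of the service level."""
    z = NormalDist().inv_cdf(service_level)
    return z * sigma_lead_time_demand


# Hypothetical inputs: 95% cycle service level, sigma_L = 40 units.
print(round(safety_stock(0.95, 40.0), 1))  # 65.8 (z is about 1.645)
```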

10. RESULTS

The simulation successfully modeled the real-world system and generated useful outputs for:

  • prediction,
  • optimization,
  • decision support,
  • risk reduction.

11. ADVANTAGES

  • Low cost
  • Risk free
  • Faster analysis
  • Repeatable
  • Flexible
  • Supports optimization

12. LIMITATIONS

  • Depends on data quality
  • Time-consuming model development
  • Requires expert validation
  • May not guarantee optimum solution

13. VERIFICATION AND VALIDATION

Verification: Are we building the model correctly?

Validation: Are we building the correct model?

Error Formula:


Error = |Actual - Simulated|

14. APPLICATIONS

  • Mechanical Engineering
  • Manufacturing Systems
  • Healthcare
  • Transportation
  • Supply Chain
  • Finance
  • Aerospace

15. VIVA QUESTIONS

Q1. What is simulation?
Virtual representation of a real-world system.

Q2. What is Monte Carlo simulation?
Random sampling-based simulation.

Q3. Difference between verification and validation?
Verification = model correctness; Validation = real-world accuracy.

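Q4. What is Little’s Law?
L = λW: in steady state, the average number of items in a system equals the arrival rate multiplied by the average time an item spends in the system.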

Q5. Why is simulation important?
It reduces risk and improves decision quality.


16. CONCLUSION

Simulation is an essential engineering decision-support methodology used to model complex systems, reduce uncertainty, optimize resources, and improve decision-making.

Final Principle


Correct Problem + Correct Data + Correct Model + Correct Interpretation

=

Reliable Decision Support


— End of Simulation Laboratory Manual —


F-Test, Z-Test, Chi-Square Test: Software Lab


  1. F-Test (Snedecor’s F-Distribution)

Definition

The F-distribution is a sampling distribution used to compare the variances of two independent samples. If

· X has a chi-square distribution with d_1 degrees of freedom (DOF),

· Y has a chi-square distribution with d_2 DOF,

then

F = \frac{X/d_1}{Y/d_2} \quad \text{follows an F-distribution with } (d_1, d_2) \text{ DOF}.

For two independent samples from normal populations with the same variance:

F = \frac{S_1^2}{S_2^2} = \frac{\sum_{i=1}^{n_1} (x_i - \bar{x}_1)^2 / (n_1 - 1)}{\sum_{j=1}^{n_2} (x_j - \bar{x}_2)^2 / (n_2 - 1)}

Rule: The larger variance is always placed in the numerator → F \ge 1.

Procedure for F-Test

  1. Null hypothesis H_0: \sigma_1^2 = \sigma_2^2 (no significant difference between variances).

  2. Alternative hypothesis H_a (one- or two-tailed as per problem).

  3. Compute sample means:

    \bar{x}_1 = \frac{\sum x_1}{n_1}, \quad \bar{x}_2 = \frac{\sum x_2}{n_2}

  4. Compute sample variances S_1^2 and S_2^2:

    S_1^2 = \frac{\sum (x_i - \bar{x}_1)^2}{n_1 - 1}, \quad S_2^2 = \frac{\sum (x_j - \bar{x}_2)^2}{n_2 - 1}

    (If variances are given directly, use them.)

  5. Calculate F_c = \frac{\text{larger variance}}{\text{smaller variance}}.

  6. Compare with F-table value at given \alpha and DOF (n_1-1, n_2-1).

Acceptance criterion:

· If F_c < F_{\text{table}} → Accept H_0 (variances are equal).

· If F_c \ge F_{\text{table}} → Reject H_0 (variances differ significantly).

Worked Example – Packaging Machine Weights

Data: Two machines A and B, each with 10 packs. Nominal weight should be consistent.

Given data: the raw weights are incomplete in the source. Only nine of the ten Machine A values are listed, and Machine B's weights are not listed at all:

Machine A: 50.8, 51.0, 49.5, 52.1, 51.8, 41.4, 51.5, 49.0, 48.0, (tenth value missing)

The listed values are also not consistent with the stated variance (the entry 41.4 alone would push S_1^2 well above 2.97), so at least one entry appears to be mistranscribed. The summary statistics given in the source, computed with n_1 = n_2 = 10, are therefore used:

\bar{x}_1 = 49.93, \bar{x}_2 = 49.03

S_1^2 = 2.9709, S_2^2 = 0.4506

F_c = \frac{2.9709}{0.4506} = 6.5932

DOF = (9, 9), \alpha = 0.05, F_{\text{table}} = 3.18

Since 6.5932 > 3.18 → Reject H_0. Conclude machines have significantly different variances.
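The calculation and decision rule can be sketched as below. The critical value 3.18 is the tabulated figure quoted in the example, entered by hand; the function names are illustrative.

```python
def variance_ratio(var_1, var_2):
    """F statistic with the larger sample variance in the numerator, so F >= 1."""
    return max(var_1, var_2) / min(var_1, var_2)


def f_test_decision(f_calc, f_table):
    """Apply the acceptance criterion: reject H0 when F_c >= F_table."""
    return "reject H0" if f_calc >= f_table else "fail to reject H0"


f_calc = variance_ratio(2.9709, 0.4506)           # sample variances from the example
decision = f_test_decision(f_calc, f_table=3.18)  # tabulated F(9, 9) at alpha = 0.05
print(round(f_calc, 4), decision)  # 6.5932 reject H0
```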


  2. Chi-Square (\chi^2) Test – Goodness of Fit

Definition

Used for categorical variables to test how well observed data fit an expected distribution.

\chi^2 = \sum \frac{(O - E)^2}{E}

Where O = observed frequency, E = expected frequency.

Properties

· Only positive values, skewed right.

· Family of distributions indexed by degrees of freedom (DF).

· DF = k - 1 (where k = number of categories).

Acceptance Criteria (at significance level \alpha)

· If \chi^2_{\text{stat}} > \chi^2_{\text{critical}}(\alpha, k-1) → Reject H_0.

· If \chi^2_{\text{stat}} \le \chi^2_{\text{critical}} → Accept H_0 (or fail to reject).

Worked Example – Coin Toss

A coin tossed 100 times, heads observed 65 times. Test bias at \alpha = 0.01.

Hypotheses:

H_0: Coin is fair (Heads = Tails = 50)

H_a: Coin is biased

Observed: O_H = 65, O_T = 35

Expected: E_H = 50, E_T = 50

\chi^2 = \frac{(65-50)^2}{50} + \frac{(35-50)^2}{50} = \frac{225}{50} + \frac{225}{50} = 4.5 + 4.5 = 9

With Yates’ continuity correction (sometimes applied when DF = 1; its effect is small here because n is large), 0.5 is subtracted from each absolute deviation |O - E|:

\chi^2_{\text{Yates}} = \frac{(|65-50|-0.5)^2}{50} + \frac{(|35-50|-0.5)^2}{50} = \frac{(14.5)^2}{50} + \frac{(14.5)^2}{50} = 2 \times \frac{210.25}{50} = 8.41

Critical value: \chi^2_{0.01, 1} = 6.635

Since 9 > 6.635 (or 8.41 > 6.635) → Reject H_0. Coin is biased.
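A minimal sketch of the coin-toss calculation, with and without Yates' correction. The critical value 6.635 is the tabulated figure quoted above, entered by hand.

```python
def chi_square(observed, expected, yates=False):
    """Goodness-of-fit statistic; with yates=True subtract 0.5 from each |O - E|."""
    correction = 0.5 if yates else 0.0
    return sum(
        (abs(o - e) - correction) ** 2 / e
        for o, e in zip(observed, expected)
    )


observed = [65, 35]  # heads, tails
expected = [50, 50]  # fair coin, 100 tosses
critical = 6.635     # tabulated chi-square for alpha = 0.01, DF = 1

plain = chi_square(observed, expected)
corrected = chi_square(observed, expected, yates=True)
print(plain, round(corrected, 2))              # 9.0 8.41
print(plain > critical, corrected > critical)  # True True -> reject H0 either way
```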


  3. Student’s t-Distribution

Definition

Used when the sample size is small (n \le 30) and the population standard deviation \sigma is unknown. Developed by W.S. Gosset (pseudonym “Student”).

t = \frac{\bar{x} - \mu}{S / \sqrt{n}}, \quad \text{where } S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2

· \bar{x} = sample mean, \mu = population mean, n = sample size, S = sample standard deviation.

Properties

· Ranges from -\infty to +\infty.

· Bell-shaped, symmetric about 0, but heavier tails than normal.

· DOF = n - 1.

· Used when population standard deviation unknown.

Types of t-Tests

  1. One-sample t-test – compares sample mean to a known population mean.

  2. Independent two-sample t-test – compares means of two independent groups.

    t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}

  3. Paired t-test – compares two related samples (e.g., before and after).

Acceptance Criteria

· If |t_{\text{calc}}| > t_{\text{critical}} → Reject H_0.

· If |t_{\text{calc}}| \le t_{\text{critical}} → Accept H_0.
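The one-sample and two-sample statistics can be computed with the standard library as sketched below. The sample values are made up for illustration, and the critical value 2.776 is the tabulated two-tailed figure for α = 0.05 with 4 DOF.

```python
import math
import statistics


def one_sample_t(sample, mu):
    """Return (t, DOF) with t = (x_bar - mu) / (S / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divides by n - 1
    return (x_bar - mu) / (s / math.sqrt(n)), n - 1


def two_sample_t(sample_1, sample_2):
    """t = (x1_bar - x2_bar) / sqrt(S1^2/n1 + S2^2/n2), the two-sample formula above."""
    var_1 = statistics.variance(sample_1)
    var_2 = statistics.variance(sample_2)
    standard_error = math.sqrt(var_1 / len(sample_1) + var_2 / len(sample_2))
    return (statistics.mean(sample_1) - statistics.mean(sample_2)) / standard_error


# Hypothetical data, for illustration only.
group_a = [10, 12, 11, 13, 14]
group_b = [8, 9, 10, 11, 12]

t_one, dof = one_sample_t(group_a, mu=10)
t_critical = 2.776  # tabulated two-tailed value for alpha = 0.05, DOF = 4
print(round(t_one, 3), dof, abs(t_one) > t_critical)  # 2.828 4 True
print(round(two_sample_t(group_a, group_b), 3))       # 2.0
```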


  4. ANOVA – Analysis of Variance

Definition

Compares means of more than two populations simultaneously. Developed by R.A. Fisher.

Example uses:

· Yield of crop from several seed varieties.

· Smoking habits across multiple groups.

· Gasoline mileage of different automobiles.

Procedure (One-Way ANOVA)

  1. Compute mean of each sample: \bar{x}_1, \bar{x}_2, \dots, \bar{x}_k.

  2. Compute overall mean: \bar{\bar{x}} = \frac{\sum \bar{x}_i}{k} (weighted by sample sizes if unequal).

  3. Sum of squares between groups (treatment variation):

    SS_{\text{between}} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2

  4. Sum of squares within groups (error variation):

    SS_{\text{within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2

  5. Compute F = \frac{MS_{\text{between}}}{MS_{\text{within}}}, where MS = SS/DF.

  6. Compare with F-table (DOF between = k-1, DOF within = N-k).

Worked Example – Studying Methods

Three methods (A, B, C), each with 10 students. Test if mean scores differ.

Data summary (from PDF):

Method A mean = 8.7, B mean = 8.6, C mean = 8.5, overall mean = 8.6.

Between-group sum of squares:

SS_{\text{between}} = 10(8.7-8.6)^2 + 10(8.6-8.6)^2 + 10(8.5-8.6)^2 = 10(0.01) + 0 + 10(0.01) = 0.2

Within-group sum of squares (sum of squared deviations inside each method):

Given in the source: SS_A = 6.6, SS_B = 10.9, SS_C = 10.5 → SS_{\text{within}} = 28.0

ANOVA table:

Source     SS      DF    MS       F
Between    0.2     2     0.100    0.096
Within     28.0    27    1.037
Total      28.2    29

Here MS_{\text{within}} = 28.0/27 ≈ 1.037 and F = 0.100/1.037 ≈ 0.096. The value 0.0071 quoted in the source matches 0.2/28.2 = SS_{\text{between}}/SS_{\text{total}}, which is not the F ratio. Because F is far below 1, it cannot exceed the critical value at any usual significance level, so there is no significant difference between the methods.

Acceptance: If F_{\text{calc}} < F_{\text{critical}}, accept H_0 (all means equal).
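The ANOVA quantities for the studying-methods example can be recomputed from the group summaries, as in this sketch. The function and key names are illustrative; only the group sizes, means, and within-group sums of squares come from the example.

```python
def one_way_anova(group_sizes, group_means, within_ss):
    """One-way ANOVA from per-group summaries.

    within_ss[i] is the sum of squared deviations inside group i.
    """
    k = len(group_means)
    n_total = sum(group_sizes)
    grand_mean = sum(n * m for n, m in zip(group_sizes, group_means)) / n_total
    ss_between = sum(
        n * (m - grand_mean) ** 2 for n, m in zip(group_sizes, group_means)
    )
    ss_within = sum(within_ss)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df": (k - 1, n_total - k),
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


anova = one_way_anova(
    group_sizes=[10, 10, 10],
    group_means=[8.7, 8.6, 8.5],
    within_ss=[6.6, 10.9, 10.5],
)
print(anova["df"], round(anova["ms_within"], 3), round(anova["f"], 3))
# (2, 27) 1.037 0.096
```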


  5. Design of Experiments (DOE) – Simple Factorial

Example Table (2 Factors)

Experiment No    Temperature (°C)    Pressure (Bar)    Output Quality
1                Low                 Low               70
2                Low                 High              75
3                High                Low               80
4                High                High              90

Conclusion: High temperature and high pressure give the best output quality.
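The conclusion can be backed by the main effects and the interaction effect of the two factors. The sketch below uses the standard definitions for a two-level factorial (average at the high level minus average at the low level); the function names are illustrative.

```python
# Responses of the 2x2 factorial, keyed by (temperature, pressure) level.
quality = {
    ("low", "low"): 70,
    ("low", "high"): 75,
    ("high", "low"): 80,
    ("high", "high"): 90,
}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def main_effect(responses, factor_index):
    """Average response at the high level minus average response at the low level."""
    high = mean(y for levels, y in responses.items() if levels[factor_index] == "high")
    low = mean(y for levels, y in responses.items() if levels[factor_index] == "low")
    return high - low


def interaction_effect(responses):
    """Half the difference between the pressure effect at high and at low temperature."""
    at_high_temp = responses[("high", "high")] - responses[("high", "low")]
    at_low_temp = responses[("low", "high")] - responses[("low", "low")]
    return (at_high_temp - at_low_temp) / 2


print(main_effect(quality, 0))      # temperature effect: 12.5
print(main_effect(quality, 1))      # pressure effect: 7.5
print(interaction_effect(quality))  # interaction effect: 2.5
```

Temperature has the larger main effect, and the positive interaction means pressure helps more at high temperature than at low temperature.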


Summary Diagram of Statistical Test Selection

  
                             ┌─────────────────────┐
                             │  What is your goal? │
                             └──────────┬──────────┘
                                        │
            ┌───────────────────────────┼───────────────────────────┐
            │                           │                           │
            ▼                           ▼                           ▼
   ┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
   │ Compare variance│        │ Compare means   │        │ Compare means   │
   │ of 2 groups     │        │ of 1 group to   │        │ of >2 groups    │
   │                 │        │ known value     │        │                 │
   └────────┬────────┘        └────────┬────────┘        └────────┬────────┘
            │                          │                          │
            ▼                          ▼                          ▼
   ┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
   │    F-test       │        │  One-sample     │        │   ANOVA         │
   │                 │        │  t-test         │        │  (F-test)       │
   └─────────────────┘        └─────────────────┘        └─────────────────┘

   For categorical data (goodness of fit) → Chi-square test
  

  
Enhanced Statistical Tests – Integrated Study Notes
  
1. F-Test (Variance Comparison)

Core Formula

If X \sim \chi^2(d_1) and Y \sim \chi^2(d_2) are independent, then

F = \frac{X/d_1}{Y/d_2} \sim F(d_1, d_2)

For samples:

F = \frac{S_1^2}{S_2^2} (larger variance in numerator → F \ge 1)

Key Assumptions

•	Populations are normally distributed.

•	Samples are independent.

•	The test is sensitive to non-normality, especially with small n.

Procedure Additions

•	Always use the upper-tail critical value when the larger variance is in the numerator.

•	For a two-tailed test at level \alpha: compare F_c to F_{\alpha/2}(n_1-1, n_2-1), taking the numerator DOF from the sample with the larger variance.

•	Effect size: the variance ratio itself (e.g., F \approx 6.6 means one sample's variance is about 6.6× the other's).

Worked Example (Packaging Machines):

S_1^2 = 2.9709, S_2^2 = 0.4506, F_c = 6.5932, F_{\text{table}}(9, 9) = 3.18 at \alpha = 0.05

F_c > F_{\text{table}} → Reject H_0. Machines have significantly different precision.

Pitfall: Do not use F-test on non-normal data (especially heavy tails). Consider Levene’s or Brown-Forsythe test instead.
  
2. Chi-Square (\chi^2) Tests

Goodness-of-Fit

\chi^2 = \sum \frac{(O - E)^2}{E}

DF = k - 1 (or k - 1 - m if m parameters are estimated from the data).

Yates’ Continuity Correction (for 1 DF, small n):

\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}

Assumptions

•	Expected frequencies \ge 5 in most cells (all \ge 1, with no more than 20% below 5).

•	Independent observations.

Worked Example (Coin): \chi^2 = 9 > 6.635 (\alpha = 0.01, DF = 1) → biased. With Yates: 8.41, still significant.

Test of Independence / Homogeneity (Important Addition)

Use for contingency tables (e.g., gender vs. preference).

DF = (r - 1)(c - 1) for r rows and c columns. Same formula, with E = (row total × column total) / grand total.

When to Choose Chi-Square

•	Categorical data only.

•	Large sample sizes.

Alternatives: Fisher’s Exact Test (small n), G-test.
  
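3. Student’s t-Tests

One-Sample

t = \frac{\bar{x} - \mu}{S / \sqrt{n}}, \quad \text{DOF} = n - 1

Independent Two-Sample (assume equal variance first)

t = \frac{\bar{x}_1 - \bar{x}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad \text{DOF} = n_1 + n_2 - 2

Pooled variance:

S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}

Welch’s t-test (unequal variances – more robust):

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}

with approximate DF (Satterthwaite).

Paired t-test

t = \frac{\bar{d}}{S_d / \sqrt{n}}, \quad \text{DOF} = n - 1

where \bar{d} and S_d are the mean and standard deviation of the n paired differences.

Assumptions (critical)

•	Normality of data (or of differences in paired). Central Limit Theorem helps for n > 30.

•	Independence of observations.

•	Equal variances (for pooled version) → test first with F-test.

Effect Size: Cohen’s d (0.2 small, 0.5 medium, 0.8 large).

Common Pitfall: Using independent t-test on paired data (inflates Type II error).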
  
4. One-Way ANOVA

Core Idea: Partition total variance into Between + Within.

Formulas:

SS_{\text{between}} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2, \quad SS_{\text{within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2

F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}

Studying Methods Example:

Between SS = 0.2, Within SS = 28, F = (0.2/2) / (28/27) \approx 0.096 (very small).

Fail to reject H_0 → no evidence methods differ.

Post-Hoc Tests (if significant): Tukey HSD, Bonferroni, Scheffé.

Effect Size: \eta^2 = SS_{\text{between}} / SS_{\text{total}} (proportion of variance explained).

Assumptions

•	Normality within groups.

•	Homogeneity of variances (Levene’s test).

•	Independence.

Two-Way ANOVA / Factorial (extension of the DOE section): Tests main effects + interaction.

Alternatives: Kruskal-Wallis (non-parametric), Welch ANOVA (unequal var).
  
5. Design of Experiments (DOE) – Basics & Additions

Full Factorial 2² Example:

Exp	Temp	Pressure	Quality
1	Low	Low	70
2	Low	High	75
3	High	Low	80
4	High	High	90

Main Effects: Temp effect = (80+90)/2 - (70+75)/2 = 12.5

Pressure effect = (75+90)/2 - (70+80)/2 = 7.5

Interaction: Present if lines cross in the interaction plot.

Quick Reference: Statistical Tests at a Glance

Test               Purpose                      Data Type      Sample Size     Key Formula
F-Test             Compare variances            Continuous     Any             F = S₁²/S₂²
χ² (Chi-Square)    Categorical relationships    Categorical    Large           χ² = Σ(O-E)²/E
t-Test             Compare means (1 or 2)       Continuous     Small (n≤30)    t = (x̄ - μ)/(s/√n)
ANOVA              Compare 3+ means             Continuous     Any             F = MS_B/MS_W
DOE                Process optimization         Mixed          Planned         Factorial design

 

Test Selection Flowchart

Start → What is your research question?

•       Compare variances (2 groups) → F-Test or Levene's Test

•       Compare means (1 sample to known μ) → One-Sample t-Test

•       Compare means (2 independent groups) → Independent t-Test (Welch if unequal var)

•       Compare means (paired/before-after) → Paired t-Test

•       Compare means (3+ groups) → One-Way ANOVA + Post-Hoc Tests

•       Test categorical fit to expected → Chi-Square Goodness of Fit

•       Test association between categorical → Chi-Square Test of Independence

•       Violate assumptions? Small n? → Non-Parametric Alternatives
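The flowchart can be read as a lookup from research question to test. A minimal Python sketch of that idea follows; the key names and the `choose_test` helper are invented for this illustration and do not come from any library.

```python
# Hypothetical lookup table mirroring the flowchart; the keys are made-up labels.
TEST_FOR_QUESTION = {
    "variances_two_groups": "F-Test or Levene's Test",
    "mean_one_sample": "One-Sample t-Test",
    "means_two_independent": "Independent t-Test (Welch if unequal var)",
    "means_paired": "Paired t-Test",
    "means_three_or_more": "One-Way ANOVA + Post-Hoc Tests",
    "categorical_fit": "Chi-Square Goodness of Fit",
    "categorical_association": "Chi-Square Test of Independence",
}


def choose_test(question: str, assumptions_ok: bool = True) -> str:
    """Return the test the flowchart suggests for a research question."""
    if question not in TEST_FOR_QUESTION:
        raise ValueError(f"unknown research question: {question!r}")
    if not assumptions_ok:
        # Last branch of the flowchart: violated assumptions or small n.
        return "Non-Parametric Alternatives"
    return TEST_FOR_QUESTION[question]


print(choose_test("means_paired"))
print(choose_test("means_three_or_more", assumptions_ok=False))
```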

1. F-TEST (Variance Comparison)

Definition

The F-test compares variances of two independent samples using the F-distribution. It answers: Do two populations have significantly different spreads?

Core Formula

F = S₁²/S₂² (larger variance always in numerator → F ≥ 1)

Where S² = Σ(x - x̄)² / (n-1)

Assumptions

•       Both populations normally distributed

•       Samples are independent

•       Random sampling used

⚠ Warning: Sensitive to non-normality, especially with small samples.

Procedure

•       Step 1: State H₀: σ₁² = σ₂² (variances equal) vs H₁: σ₁² ≠ σ₂²

•       Step 2: Compute sample variances S₁² and S₂²

•       Step 3: Calculate F = larger/smaller

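•       Step 4: Find the critical value F(n₁-1, n₂-1) from the F-table, taking the numerator DF from the sample with the larger variance. For the two-sided H₁ above, use the upper α/2 point; the upper α point applies only to a one-sided H₁

•       Step 5: Decision → If F_calc ≥ F_table, reject H₀

Worked Example: Packaging Machine Precision

Two packaging machines, 10 samples each. Test if precision differs at α = 0.05.

Given: S₁² = 2.9709, S₂² = 0.4506, n₁ = n₂ = 10

F = 2.9709 / 0.4506 = 6.593

Critical value (two-sided, α/2 = 0.025): F₀.₀₂₅(9,9) = 4.03. The one-sided value F₀.₀₅(9,9) = 3.18 leads to the same decision.

Since 6.593 > 4.03 → Reject H₀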

Conclusion: Machines have significantly different precision.
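As a rough check of the arithmetic above, the F-ratio can be computed in a few lines of Python. This is a sketch using only the standard library; `f_ratio` and `f_ratio_from_samples` are hypothetical helper names, and the critical value is typed in from an F-table (upper 2.5% point of F(9, 9), the two-sided cutoff at α = 0.05) rather than computed.

```python
from statistics import variance


def f_ratio(var_a: float, n_a: int, var_b: float, n_b: int):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


def f_ratio_from_samples(sample_a, sample_b):
    """Same as f_ratio, but starting from raw data (S² uses the n - 1 divisor)."""
    return f_ratio(variance(sample_a), len(sample_a),
                   variance(sample_b), len(sample_b))


# Summary statistics from the packaging machine example.
F, df_num, df_den = f_ratio(2.9709, 10, 0.4506, 10)

# Assumption: table lookup, upper 2.5% point of F(9, 9).
F_CRITICAL = 4.03

print(f"F = {F:.3f} with DF = ({df_num}, {df_den})")
print("Reject H0" if F >= F_CRITICAL else "Fail to reject H0")
```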

Effect Size

•       F-ratio itself indicates effect size (e.g., F=6.6 means 6.6× variance difference)

•       Larger F → More significant difference in spread

Common Pitfalls

•       Using F-test on severely non-normal data → Consider Levene's or Brown-Forsythe

•       Forgetting to place larger variance in numerator

•       Using the wrong DF in the table lookup

Alternatives

•       Levene's Test (more robust to non-normality)

•       Brown-Forsythe Test (median-based, even more robust)

2. CHI-SQUARE (χ²) TEST

Definition

Chi-square tests the relationship between categorical variables. It answers: Do observed frequencies fit an expected distribution? Are two categorical variables associated?

Core Formula

χ² = Σ [(O - E)² / E]

Where O = observed frequency, E = expected frequency

Degrees of Freedom

•       Goodness of fit: DF = k - 1 (k = number of categories)

•       Independence test: DF = (r - 1)(c - 1) (r rows, c columns)

Assumptions

•       Expected frequencies E ≥ 5 in at least 80% of cells

•       Independent observations

•       Large sample sizes recommended

Procedure

•       Step 1: State H₀ (fit expected / no association) vs H₁

•       Step 2: Count observed frequencies O

•       Step 3: Calculate expected frequencies E

•       Step 4: Compute χ²_calc = Σ(O-E)²/E

•       Step 5: Compare χ²_calc with χ²_α(DF)

•       Step 6: If χ²_calc > χ²_table, reject H₀
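For the independence test, Step 3 is normally done with E = (row total × column total) / grand total. The sketch below applies the procedure to a hypothetical 2 × 2 table; the counts are made up for illustration, and 3.841 is the tabulated χ² critical value for α = 0.05 with 1 DF.

```python
def chi_square_independence(table):
    """Return (chi2, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0 (no association).
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical observed counts (rows = groups, columns = outcomes).
observed = [[10, 20],
            [20, 10]]
chi2, df = chi_square_independence(observed)

CHI2_CRITICAL = 3.841  # table value for alpha = 0.05, DF = 1
print(f"chi2 = {chi2:.3f}, DF = {df}")
print("Reject H0" if chi2 > CHI2_CRITICAL else "Fail to reject H0")
```

The function does not check the E ≥ 5 rule; that check is left to the reader.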

Worked Example: Coin Bias Test

Coin tossed 100 times: 65 heads, 35 tails. Test fairness at α = 0.01.

Observed: O_H = 65, O_T = 35

Expected: E_H = 50, E_T = 50

χ² = (65-50)²/50 + (35-50)²/50 = 225/50 + 225/50 = 9.0

Critical: χ²₀.₀₁,₁ = 6.635

Since 9.0 > 6.635 → Reject H₀

Conclusion: Coin is biased.

Yates Continuity Correction

χ² = Σ [(|O - E| - 0.5)² / E]

Use for 1 DF when expected frequencies are small (< 10). Applied to the coin example: χ² = 2 × (15 - 0.5)²/50 = 8.41, slightly smaller than the uncorrected 9.0 but still above 6.635.
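Both the plain and the Yates-corrected statistics for the coin example can be checked with a short sketch. `chi_square_gof` is a hypothetical helper name; the critical value 6.635 is the table value quoted in the worked example.

```python
def chi_square_gof(observed, expected, yates=False):
    """Goodness-of-fit chi-square; optionally apply the Yates 0.5 correction."""
    correction = 0.5 if yates else 0.0
    return sum((abs(o - e) - correction) ** 2 / e
               for o, e in zip(observed, expected))


observed = [65, 35]   # heads, tails
expected = [50, 50]   # fair coin, 100 tosses

plain = chi_square_gof(observed, expected)
corrected = chi_square_gof(observed, expected, yates=True)

CHI2_CRITICAL = 6.635  # alpha = 0.01, DF = 1
print(f"chi2 = {plain:.2f}, with Yates = {corrected:.2f}")
print("Reject H0" if corrected > CHI2_CRITICAL else "Fail to reject H0")
```

This sketch assumes every |O - E| is at least 0.5; smaller deviations would need extra care when the correction is applied.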

Common Pitfalls

•       Using chi-square with E < 5 → Violates assumptions

•       Forgetting the squared term (O-E)²

•       Confusing the chi-square test with the t-test (categorical counts vs continuous measurements)

Alternatives

•       Fisher's Exact Test (small samples)

•       G-Test (log-likelihood ratio)

3. STUDENT'S t-TEST

Definition

The t-test compares means when sample sizes are small (n ≤ 30) and population variance is unknown. Developed by W.S. Gosset (pseudonym "Student").

Core Formulas

One-Sample t

t = (x̄ - μ) / (s / √n), DF = n - 1

Independent Two-Sample t (Equal Variance)

t = (x̄₁ - x̄₂) / (s_p √(1/n₁ + 1/n₂))

where s_p² = [(n₁-1)s₁² + (n₂-1)s₂²] / (n₁ + n₂ - 2)

Welch's t (Unequal Variance - Preferred)

t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

(Welch's DF computed via Satterthwaite approximation)

Paired t-Test

t = d̄ / (s_d / √n), where d = x₁ - x₂
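The four formulas above translate almost line by line into Python. The sketch below uses only the standard library; the function names and the small data sets are hypothetical, and the Welch DF uses the usual Welch-Satterthwaite expression, which the notes mention but do not write out.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, mu):
    """t = (x̄ - μ) / (s / √n), DF = n - 1."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n)), n - 1


def pooled_t(a, b):
    """Equal-variance two-sample t with pooled variance s_p²."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """Unequal-variance t with Welch-Satterthwaite approximate DF."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def paired_t(x1, x2):
    """t = d̄ / (s_d / √n) on the differences d = x₁ - x₂."""
    d = [p - q for p, q in zip(x1, x2)]
    n = len(d)
    return mean(d) / (stdev(d) / sqrt(n)), n - 1


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
group_a = [2, 4, 6]
group_b = [1, 3, 5]

print("one-sample:", one_sample_t(group_a, mu=3))
print("pooled:    ", pooled_t(group_a, group_b))
print("Welch:     ", welch_t(group_a, group_b))
print("paired:    ", paired_t([3, 5, 7], [2, 3, 4]))
```

With equal group sizes the pooled and Welch statistics coincide; they differ in t only when n₁ ≠ n₂, and in DF whenever the sample variances differ.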

Assumptions

•       Data normally distributed (or n large enough for the Central Limit Theorem to apply)

•       Observations independent

•       Equal variances (for pooled version) → Test with F-test first

Acceptance Criterion

•       If |t_calc| > t_critical → Reject H₀

•       If |t_calc| ≤ t_critical → Fail to reject H₀

Effect Size: Cohen's d

d = (x̄₁ - x̄₂) / s_p

•       d = 0.2 → Small effect

•       d = 0.5 → Medium effect

•       d = 0.8 → Large effect
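Cohen's d can be computed alongside the t statistic. In this sketch the data are hypothetical, and treating 0.2, 0.5 and 0.8 as lower bounds of the bands (with "negligible" below 0.2) is an assumption made for the illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """d = (x̄₁ - x̄₂) / s_p, using the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    sp = sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2))
    return (mean(a) - mean(b)) / sp


def effect_label(d):
    """Map |d| onto the small / medium / large bands (assumed cut-offs)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


d = cohens_d([2, 4, 6], [1, 3, 5])  # hypothetical data
print(f"d = {d:.2f} ({effect_label(d)})")
```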

Common Pitfalls

•       Using pooled t-test with unequal variances → Use Welch instead

•       Using independent t on paired data (violates independence)

•       Ignoring normality assumption

Non-Parametric Alternatives

•       One-sample: Wilcoxon Signed-Rank

•       Two-sample: Mann-Whitney U

•       Paired: Wilcoxon Signed-Rank

4. ONE-WAY ANOVA (Analysis of Variance)

Definition

ANOVA compares means of 3 or more groups. Developed by R.A. Fisher. It partitions total variance into between-group and within-group components.

Core Concept

SS_Total = SS_Between + SS_Within

Formulas

Between-Group Sum of Squares

SS_Between = Σ nᵢ (x̄ᵢ - x̄̄)²

Within-Group Sum of Squares

SS_Within = Σ Σ (xᵢⱼ - x̄ᵢ)²

F-Ratio

F = MS_Between / MS_Within = (SS_B/(k-1)) / (SS_W/(N-k))

where k = number of groups, N = total observations

Procedure

•       Step 1: Compute mean of each group (x̄₁, x̄₂, ..., x̄_k)

•       Step 2: Compute overall mean x̄̄

•       Step 3: Calculate SS_Between and SS_Within

•       Step 4: Compute MS values and F-ratio

•       Step 5: Compare F_calc with F_α(k-1, N-k)

•       Step 6: If F_calc > F_table, reject H₀
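Steps 1 to 4 can be sketched in Python from raw observations. `one_way_anova` is a hypothetical helper and the three small groups are made-up data; the critical value for Step 5 still has to come from an F-table.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the sums of squares, DF, mean squares and F for raw groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]               # Step 1
    grand_mean = sum(sum(g) for g in groups) / n_total    # Step 2

    # Step 3: between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    # Step 4: mean squares and F-ratio.
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df": (df_between, df_within),
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


# Hypothetical data: three groups of three observations.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(result)
```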

Worked Example: Study Methods (A, B, C)

10 students per method. Test if mean scores differ at α = 0.05.

Means: x̄_A = 8.7, x̄_B = 8.6, x̄_C = 8.5, x̄̄ = 8.6

ANOVA Table:

Source	SS	DF	MS	F
Between	0.2	2	0.1	0.096
Within	28.0	27	1.037	
Total	28.2	29		

F = 0.1 / 1.037 = 0.096 << F_0.05(2,27) ≈ 3.35

Decision: Fail to reject H₀ → No significant difference between methods.

Effect Size: Eta-Squared

η² = SS_Between / SS_Total

(Proportion of variance explained by group membership)
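The study methods example gives only summary statistics, which is enough to reproduce SS_Between, F and η². A minimal sketch follows, taking SS_Within = 28.0 as given in the notes; the variable names are illustrative.

```python
# Summary statistics from the study methods example.
means = {"A": 8.7, "B": 8.6, "C": 8.5}
n_per_group = 10
ss_within = 28.0  # given in the notes, not recomputed here

k = len(means)
n_total = k * n_per_group

# With equal group sizes the grand mean is the mean of the group means.
grand_mean = sum(means.values()) / k
ss_between = sum(n_per_group * (m - grand_mean) ** 2 for m in means.values())

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within
eta_squared = ss_between / (ss_between + ss_within)

print(f"SS_Between = {ss_between:.2f}")
print(f"F = {f_ratio:.3f}")
print(f"eta squared = {eta_squared:.4f}")
```

Here η² comes out well below 1%, which agrees with the decision not to reject H₀.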

Post-Hoc Tests (if H₀ rejected)

•       Tukey HSD (most popular)

•       Bonferroni (conservative)

•       Scheffé (most flexible)

Assumptions

•       Normality within each group

•       Homogeneity of variances (test with Levene's)

•       Independence of observations

Common Pitfalls

•       Using ANOVA without checking homogeneity first

•       Not using post-hoc when groups differ significantly

•       Ignoring interaction effects in factorial designs

Alternatives

•       Kruskal-Wallis (non-parametric, ordinal data)

•       Welch ANOVA (unequal variances)

5. DESIGN OF EXPERIMENTS (DOE) BASICS

Purpose

Systematically vary factors to optimize process output. Common in engineering, manufacturing, agriculture.

Worked Example: Temperature × Pressure Factorial

Exp	Temperature	Pressure	Output Quality
1	Low	Low	70
2	Low	High	75
3	High	Low	80
4	High	High	90

Main Effects Analysis:

Temperature effect = (80+90)/2 - (70+75)/2 = 12.5

Pressure effect = (75+90)/2 - (70+80)/2 = 7.5

Best setting: High Temperature + High Pressure → Output 90
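The main effects can be recomputed with the usual -1/+1 coding for Low/High, and the same contrast idea also gives an interaction effect (average of the runs where the product of the codes is +1 minus the average where it is -1). The coding convention and the `effect` helper are assumptions of this sketch, not part of the notes.

```python
def effect(signs, responses):
    """Average response at +1 minus average response at -1."""
    high = [y for s, y in zip(signs, responses) if s > 0]
    low = [y for s, y in zip(signs, responses) if s < 0]
    return sum(high) / len(high) - sum(low) / len(low)


# Runs 1-4 from the table, coded Low = -1 and High = +1.
temperature = [-1, -1, +1, +1]
pressure = [-1, +1, -1, +1]
quality = [70, 75, 80, 90]

# Interaction column: element-wise product of the two factor columns.
interaction = [t * p for t, p in zip(temperature, pressure)]

temp_effect = effect(temperature, quality)
pressure_effect = effect(pressure, quality)
interaction_effect = effect(interaction, quality)

print("Temperature effect:", temp_effect)
print("Pressure effect:   ", pressure_effect)
print("Interaction effect:", interaction_effect)
```

With a single run per combination there is no replication, so these effects can be ranked but not tested for significance.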

DOE Principles

•       Randomization: Reduces bias from unknown variables

•       Replication: Provides error estimates

•       Blocking: Controls nuisance factors

•       Factorial Design: Examines all factor combinations

•       Response Surface Methodology: Models continuous optimization

Common DOE Types

•       Full Factorial 2^k (all combinations)

•       Fractional Factorial (screening, fewer experiments)

•       Central Composite (curvature testing)

•       Taguchi (robust design, noise factors)

6. NON-PARAMETRIC ALTERNATIVES

When assumptions fail (non-normal, small n, ordinal data), use these:

Parametric Test	Non-Parametric Alternative
One-sample t	Wilcoxon Signed-Rank
Independent t	Mann-Whitney U
Paired t	Wilcoxon Signed-Rank
ANOVA	Kruskal-Wallis H
Correlation	Spearman Rank, Kendall τ
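To show what a rank-based alternative looks like in practice, here is a minimal sketch of the Mann-Whitney U statistic, using U₁ = R₁ - n₁(n₁+1)/2 with average ranks for ties. The helper names and the data are hypothetical, and the sketch stops at the statistic: the critical value or p-value would still come from a table or a statistics package.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return the smaller of U1 and U2 for two independent samples."""
    ranks = average_ranks(list(a) + list(b))
    r1 = sum(ranks[:len(a)])               # rank sum of the first sample
    u1 = r1 - len(a) * (len(a) + 1) / 2
    u2 = len(a) * len(b) - u1
    return min(u1, u2)


# Hypothetical data.
print(mann_whitney_u([1, 3, 5], [2, 4, 6]))
```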

 

7. BEST PRACTICES & COMMON PITFALLS

Before Testing

•       ✓ Check normality (Shapiro-Wilk, Q-Q plots)

•       ✓ Check equal variance (Levene's test)

•       ✓ Verify independence

•       ✓ Plan sample size (power analysis)

While Testing

•       ✓ Use appropriate test for data type

•       ✓ Report confidence intervals (not just p-values)

•       ✓ Report effect size (Cohen's d, η², etc.)

•       ✓ Adjust for multiple comparisons (Bonferroni)
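The Bonferroni adjustment compares each p-value with α/m, where m is the number of tests carried out. A small sketch with made-up p-values (the `bonferroni` helper is hypothetical):

```python
def bonferroni(p_values, alpha=0.05):
    """Return (threshold, decisions) where decisions[i] is True if H0 is rejected."""
    m = len(p_values)
    threshold = alpha / m
    return threshold, [p < threshold for p in p_values]


# Hypothetical p-values from three pairwise comparisons.
threshold, decisions = bonferroni([0.01, 0.02, 0.04], alpha=0.05)
print(f"per-test threshold = {threshold:.4f}")
print(decisions)
```

Without the correction all three p-values would fall below 0.05; with it only the first comparison remains significant.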

Interpretation Rules

•       p < α: Reject H₀ (statistically significant)

•       p ≥ α: Fail to reject H₀ (not significant)

•       p-value ≠ probability H₀ is true

•       Small p-value = strong evidence against H₀

Critical Pitfalls to Avoid

•       ❌ Relying only on p-values (ignoring effect size)

•       ❌ p-hacking / Multiple testing without correction

•       ❌ Using wrong test for data type

•       ❌ Assuming correlation = causation

•       ❌ Violating assumptions without sensitivity checks

8. FORMULA QUICK REFERENCE SHEET

Formulas for All Tests

Test	Formula	Critical Info
F-Test	F = S₁²/S₂²	DF = (n₁-1, n₂-1)
χ²	χ² = Σ(O-E)²/E	DF = k-1 or (r-1)(c-1)
One-Sample t	t = (x̄-μ)/(s/√n)	DF = n-1
Two-Sample t	t = (x̄₁-x̄₂)/(s_p√(1/n₁+1/n₂))	DF = n₁+n₂-2
ANOVA	F = MS_B/MS_W	DF = (k-1, N-k)
Cohen's d	d = (x̄₁-x̄₂)/s_p	0.2=small, 0.5=med, 0.8=large

Final Note for Exam Success

Remember: Each test answers a specific question about your data. Always:

•       Understand the question (what are you comparing?)

•       Check assumptions first

•       Choose the right test

•       Report effect size + confidence interval, not just p-value

•       Interpret in context (statistical significance ≠ practical significance)

Good luck with your M.Tech exams and viva! 🎓
