Monday, 18 May 2026

F test, Z test, chi-square test software lab

PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method) are the two foundational mathematical models used in project management to plan, schedule, and control complex projects.

While CPM focuses on controlling fixed timelines and costs by identifying the longest sequence of dependent tasks, PERT manages timeline uncertainty by using three probabilistic time estimates for each activity.

Direct Comparison

| Feature | CPM (Critical Path Method) | PERT (Program Evaluation and Review Technique) |
|---|---|---|
| Focus | Time-cost trade-offs and cost control. | Time management and schedule unpredictability. |
| Approach | Deterministic (fixed, known time estimates). | Probabilistic (handles highly uncertain task durations). |
| Estimates | One single time estimate per activity. | Three time estimates (Optimistic, Pessimistic, Most Likely). |
| Project Type | Repetitive, predictable work (e.g., Construction). | One-off, non-repetitive work (e.g., R&D, Tech Launch). |
| Core Concept | Identifies float/slack to optimize resource allocation. | Calculates the probability of finishing by a specific date. |

How They Work Together

In modern project management platforms, these techniques are merged. You use PERT concepts to estimate individual task durations when you lack certainty. You then input those finalized durations into a CPM-driven network engine to calculate your project's final deadline and identify critical tasks.

1. CPM: Finding the Critical Path

The Critical Path Method sequences tasks using early/late start and finish dates.
  • Critical Path: The longest continuous sequence of dependent tasks through the network.
  • Total Float (Slack): The amount of time a task can be delayed without delaying the final project deadline. Critical path tasks have zero float.
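
To make the early/late dates and float concrete, here is a minimal Python sketch of a forward and backward pass. The four-task network (names, durations, dependencies) is hypothetical and invented for illustration only; it is not taken from any real schedule or from a specific tool's algorithm.

```python
# Hypothetical network: task -> (duration in days, list of predecessors).
# Tasks are listed so that every predecessor appears before its successors.
tasks = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (1, ["B", "C"]),
}

# Forward pass: earliest start (ES) and earliest finish (EF).
es, ef = {}, {}
for name, (duration, preds) in tasks.items():
    es[name] = max((ef[p] for p in preds), default=0)
    ef[name] = es[name] + duration

project_end = max(ef.values())

# Backward pass: latest finish (LF) and latest start (LS).
lf, ls = {}, {}
for name in reversed(list(tasks)):
    successors = [s for s, (_, preds) in tasks.items() if name in preds]
    lf[name] = min((ls[s] for s in successors), default=project_end)
    ls[name] = lf[name] - tasks[name][0]

# Total float = LS - ES; tasks with zero float form the critical path.
total_float = {name: ls[name] - es[name] for name in tasks}
critical_path = [name for name in tasks if total_float[name] == 0]

print(project_end)
print(total_float)
print(critical_path)
```

In this toy network, task B can slip by two days without moving the finish date, while A, C, and D cannot slip at all.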

2. PERT: Calculating Expected Duration

When task durations are unpredictable, PERT applies a weighted average formula based on three scenarios:
  • Optimistic Time ($o$): The absolute minimum time if everything goes perfectly.
  • Pessimistic Time ($p$): The maximum time if everything goes wrong.
  • Most Likely Time ($m$): The realistic, standard duration under normal conditions.
$$\text{Expected Duration } (T_e) = \frac{o + 4m + p}{6}$$
$$\text{Standard Deviation } (\sigma) = \frac{p - o}{6}$$
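
The two formulas above can be sketched in a few lines of Python. The function name `pert_estimate` and the sample activity (4, 6, and 14 days) are assumptions made up for this illustration.

```python
def pert_estimate(o, m, p):
    """Return (expected duration, standard deviation) for one activity."""
    expected = (o + 4 * m + p) / 6   # weighted average, most likely time counts 4x
    std_dev = (p - o) / 6            # spread between the two extreme estimates
    return expected, std_dev

# Hypothetical activity: optimistic 4 days, most likely 6, pessimistic 14.
expected, std_dev = pert_estimate(4, 6, 14)
print(expected, round(std_dev, 2))
```

Because the pessimistic estimate is far from the most likely one, the expected duration (7 days) lands above the most likely value of 6.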

Industry Application

In project management software like Oracle Primavera P6, CPM serves as the primary automated scheduling engine. When you click "Schedule", the software runs a forward and backward pass across all logic links and highlights your Critical Path in red.

1. The Core Concepts

  • Normal Distribution: A symmetrical, bell-shaped curve where most values cluster around the central average (mean), and taper off equally toward both extremes.
  • Standard Deviation ($\sigma$): A number that measures how far, on average, individual data points stretch out or deviate from that central mean.

📊 The Empirical Rule (The 68-95-99.7 Rule)

In a perfect normal distribution, the standard deviation determines what share of the data falls within each range around the mean:
          ▲
          │      68%
          │  ┌─────────┐
          │  │   95%   │
          │┌─┼─────────┼─┐
          ││ │  99.7%  │ │
       ───┴┴─┼─┼─────┼─┼─┴┴───
            -3  -2  -1  μ  +1  +2  +3
             Standard Deviations
  • $\pm$1 Standard Deviation: 68% of all data falls within this range. These are your normal, everyday values.
  • $\pm$2 Standard Deviations: 95% of all data falls within this range.
  • $\pm$3 Standard Deviations: 99.7% of all data falls within this range. Anything outside this is an extreme anomaly or outlier.
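
The 68-95-99.7 figures are rounded. As a quick check, the share of a normal distribution within $k$ standard deviations of the mean equals $\text{erf}(k/\sqrt{2})$, which the Python standard library can evaluate directly. The helper name `coverage` is just an illustrative choice.

```python
import math

def coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(coverage(k) * 100, 2))
```

The unrounded values are roughly 68.27%, 95.45%, and 99.73%.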

💡 A Real-World Example: IQ Scores

Human IQ scores follow a normal distribution with an average (mean) of 100 and a standard deviation of 15.
Using the empirical rule, we can estimate how scores are spread across the population:
  • Normal Range (68% of people): Score between 85 and 115 (100 $\pm$ 15).
  • High/Low Range (95% of people): Score between 70 and 130 (100 $\pm$ 30).
  • Extreme Range (99.7% of people): Score between 55 and 145 (100 $\pm$ 45). Anyone scoring above 145 is in roughly the top 0.15% of the population.

🔗 How This Connects to Your t-test

When you run a t-test, you are checking whether the difference between your two group averages is larger than the variation you would expect from random chance alone. The difference is measured in units of its standard error: if it lands roughly two or more standard errors away from zero, the p-value typically drops below 0.05 and the result is called statistically significant.
Let's analyze a real-world scientific dataset to show how Normal Distribution, Standard Deviation, and Z-scores provide actionable evidence for decision-making.

📋 The Real-World Dataset: Adult Human Heights

According to global health metrics, the height of adult human males follows a highly predictable normal distribution curve. [1, 2]
  • Population Mean ($\mu$): 175 cm (approx. 5'9")
  • Standard Deviation ($\sigma$): 7 cm
  • Target Value ($X$): A specific athlete measuring 191 cm (approx. 6'3"). [3, 4]

🧮 Step 1: Calculate the Z-Score

The Z-score tells us exactly how many standard deviations our target value lies away from the average. [4, 5]
$$\text{Z-Score Formula: } Z = \frac{X - \mu}{\sigma}$$
[4, 6]
$$Z = \frac{191 - 175}{7} = \frac{16}{7} \approx \mathbf{2.29}$$
[6, 7]
Fact: The athlete's height has a positive Z-score of 2.29, meaning they are 2.29 standard deviations taller than the average male population. [5, 8]

🔍 Step 2: Convert to Evidence (Using the Z-Table) [9]

By looking up a Z-score of 2.29 in a standard normal distribution table, we find the cumulative probability: [4, 9]
  • Area to the left (Percentile): 0.9890
  • Data Fact: This means our athlete is taller than 98.9% of all adult men globally.
  • The Remainder: Only 1.1% of the world's male population ($100\% - 98.9\%$) matches or exceeds this height. [9, 10, 11]
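
The Z-score and table lookup can be reproduced with Python's built-in `statistics.NormalDist`. This is a sketch using the figures from this example; note that it uses the unrounded Z-score, so the percentile differs very slightly from the table value read at Z = 2.29.

```python
from statistics import NormalDist

mu, sigma, x = 175, 7, 191        # mean, standard deviation, athlete's height (cm)

z = (x - mu) / sigma              # how many standard deviations above the mean
percentile = NormalDist().cdf(z)  # area to the left under the standard normal curve

print(round(z, 2))
print(round(percentile * 100, 1))
```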

📊 Visualizing the Evidence (The Bell Curve)

On the bell curve, a height of 191 cm sits far out in the right tail, past the 2nd standard deviation line (189 cm). [7, 8]

💡 Why This Evidence Matters in the Real World

  1. Manufacturing & Ergonomics: Car manufacturers use this specific data. Designing a car seat that only fits up to 2 standard deviations ($175 + 14 = 189\text{ cm}$) means safely accommodating 97.5% of men. Our 191 cm athlete would find the vehicle's legroom or headroom cramped.
  2. Medical Diagnostics: Doctors track child growth curves using these exact normal deviations. If a child’s height deviates past a Z-score of $\pm3.0$ (the extreme 0.3% of the world), it flags an immediate medical review for potential growth hormone deficiencies or abnormalities. [1, 5, 12, 13, 14]

📋 The Scenario

Imagine you train 5 employees to speed up their data entry. You measure their typing speeds (words per minute) Before the training and After the training.
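| Employee | Before ($X_1$) | After ($X_2$) | Difference ($d = X_2 - X_1$) | $d^2$ |
|---|---|---|---|---|
| 1 | 40 | 45 | 5 | 25 |
| 2 | 35 | 42 | 7 | 49 |
| 3 | 50 | 50 | 0 | 0 |
| 4 | 42 | 48 | 6 | 36 |
| 5 | 38 | 45 | 7 | 49 |
| Total ($\Sigma$) | | | $\Sigma d = 25$ | $\Sigma d^2 = 159$ |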
  • Number of participants ($n$) = 5

🧮 Step-by-Step Calculation

Step 1: Find the Mean of the Differences ($\bar{d}$)

$$\bar{d} = \frac{\Sigma d}{n} = \frac{25}{5} = 5$$

Step 2: Calculate the Standard Deviation of the Differences ($s_d$)

Use the standard deviation formula for differences:
$$s_d = \sqrt{\frac{\Sigma d^2 - \frac{(\Sigma d)^2}{n}}{n - 1}}$$
$$s_d = \sqrt{\frac{159 - \frac{(25)^2}{5}}{5 - 1}} = \sqrt{\frac{159 - 125}{4}} = \sqrt{\frac{34}{4}} = \sqrt{8.5} \approx 2.915$$

Step 3: Compute the t-statistic ($t$)

The formula for a paired t-test is:
$$t = \frac{\bar{d}}{\frac{s_d}{\sqrt{n}}}$$
$$t = \frac{5}{\frac{2.915}{\sqrt{5}}} = \frac{5}{\frac{2.915}{2.236}} = \frac{5}{1.304} \approx \mathbf{3.83}$$
Our calculated t-value is 3.83.

🔑 Decision Time: Is it Significant?

To see if 3.83 is a strong enough number, we compare it to a standard Critical Value Table using two metrics:
  1. Degrees of Freedom ($df$): $n - 1 = 5 - 1 = \mathbf{4}$
  2. Confidence Level: Standard 95% ($\alpha = 0.05$)
Looking up $df = 4$ at $\alpha = 0.05$ (two-tailed) in a standard t-distribution table gives a Critical Value of 2.776.
  • The Rule: If your calculated $t$-value is greater than the Critical Value, your results are statistically significant.
  • Conclusion: Since 3.83 > 2.776, the training program genuinely and significantly improved typing speeds. It was not a fluke.
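
The hand calculation can be double-checked with a short standard-library sketch. The variable names are illustrative, and the critical value is simply the table value quoted in this section rather than something computed by the script.

```python
import math
from statistics import mean, stdev

before = [40, 35, 50, 42, 38]
after = [45, 42, 50, 48, 45]

diffs = [a - b for a, b in zip(after, before)]   # d = X2 - X1 for each employee
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                               # sample standard deviation (divides by n - 1)
t_stat = d_bar / (s_d / math.sqrt(n))

t_critical = 2.776   # two-tailed, alpha = 0.05, df = 4 (table value quoted above)
print(round(t_stat, 2), t_stat > t_critical)
```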

🎯 Aims & Objectives of the t-test

  • Compares Two Averages: It checks if the average score of Group A is truly different from the average score of Group B.
  • Supports Cause-and-Effect Claims: In a well-controlled experiment, it helps you judge whether a specific change or treatment is linked to a different result. The test alone does not prove causation.
  • Tests a Hypothesis: It rejects or fails to reject the Null Hypothesis (the assumption of no difference) by calculating a p-value. If the p-value is less than 0.05, the difference is statistically significant.

💡 Simple Examples of What It Finds Out

  • In Business: Does a new website design (Group A) generate more sales than the old website design (Group B)?
  • In Medicine: Does a new medicine (Group A) lower blood pressure better than a dummy pill/placebo (Group B)?
  • In Education: Do students who take online classes (Group A) score higher or lower than students in physical classrooms (Group B)?

📊 The 3 Main Types of t-tests

Depending on what you want to find out, you use one of three types:
  1. Independent Samples t-test: Compares two completely separate, unrelated groups (e.g., comparing the average salaries of Men vs. Women).
  2. Paired Samples t-test: Compares the same group at two different times (e.g., testing students' scores Before training vs. After training).
  3. One-Sample t-test: Compares the average of one group against a known standard number (e.g., checking if the average weight of a cereal box is truly 500 grams).
The p-value (Probability Value) is a number between 0 and 1 that tells you how likely it is to see results at least as extreme as yours if there were truly no effect (that is, if the Null Hypothesis were true).
In simple terms, it answers the question: "Is my data just a coincidence?"

🚦 The Golden Rule of p-values

To make a decision in statistics, you compare your p-value to a benchmark called alpha ($\alpha$), which is most often set to 0.05 (5%).
  • If p-value is $\le$ 0.05 (Low): The result is statistically significant.
    • What it means: The difference is unlikely to be a coincidence. You have evidence that your training program, medicine, or new website design had an effect.
  • If p-value is > 0.05 (High): The result is not statistically significant.
    • What it means: The difference could plausibly be random noise. You do not have enough evidence to say the change had an effect.

🧠 A Real-World Analogy: The Coin Toss

Imagine a friend claims they have a magic ability to always flip "Heads".
  • Scenario A: They flip a coin 3 times and get Heads 3 times.
    • The p-value is roughly 0.125 (12.5%).
    • Since 0.125 is greater than 0.05, you say, "Anyone can get 3 heads in a row by pure luck. Prove it more."
  • Scenario B: They flip a coin 20 times and get Heads 20 times.
    • The p-value is less than 0.000001.
    • Since this is way below 0.05, you conclude, "The odds of this happening by chance are near zero. The coin is rigged or you actually have a trick."
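
Both p-values in this analogy come from the same one-line calculation: the chance that a fair coin lands heads on every toss. The sketch below assumes a one-sided test (only "all heads" counts as extreme), and the function name is illustrative.

```python
def p_all_heads(flips):
    """One-sided p-value: chance a fair coin lands heads on every one of `flips` tosses."""
    return 0.5 ** flips

print(p_all_heads(3))    # Scenario A
print(p_all_heads(20))   # Scenario B
```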

🔗 How it connects to our last example

In the manual typing speed test we just calculated, our $t$-value was 3.83. If we convert that $t$-value into a p-value using a software tool or statistical table (two-tailed, $df = 4$), the p-value is about 0.0186.
  • Interpretation: Because 0.0186 is less than 0.05, a gain this large would show up only about 1.86% of the time if the training had no effect. You can therefore conclude that the improvement is statistically significant.


📋 The Dataset

Imagine you collected sales data ($) from 5 days of the old design and 5 days of the new design:
  • Old Design (Group A): $100, $120, $110, $130, $115 (Average = $115)
  • New Design (Group B): $130, $145, $135, $150, $140 (Average = $140)
Looking at the averages ($115 vs $140), Design B looks better. But is this difference statistically significant, or just a lucky streak? Let’s find out.

💻 Choose Your Tool (Steps for Excel & Python)

You can run this instantly using whichever tool you prefer:

Option 1: In Microsoft Excel

  1. Type the Old Design data into column A (cells A1 to A5).
  2. Type the New Design data into column B (cells B1 to B5).
  3. In any empty cell, type this exact formula:
    =T.TEST(A1:A5, B1:B5, 2, 2)
  4. Press Enter. This will instantly give you your p-value.

Option 2: In Python

If you are coding, you can copy and paste this simple script:
import scipy.stats as stats

# Input data
old_design = [100, 120, 110, 130, 115]
new_design = [130, 145, 135, 150, 140]

# Run Independent t-test
t_stat, p_value = stats.ttest_ind(old_design, new_design)

print(f"t-statistic: {t_stat:.4f}")
print(f"p-value: {p_value:.4f}")

🔑 How to Read the Output (Result Interpretation)

When you run the test using either method above, you will get a p-value of approximately 0.004.
Here is how to interpret that result:
  • The Rule: In statistics, if the p-value is less than 0.05, the result is statistically significant.
  • Your Conclusion: Since 0.004 is much less than 0.05, you reject the idea that this happened by chance. The New Website Design objectively, significantly increases sales.
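
To see where that result comes from, here is a standard-library sketch of the pooled (equal-variance) two-sample t-statistic, which is the same assumption the Excel formula (type 2) and the default `ttest_ind` call make. It computes only the t-statistic and degrees of freedom, not the p-value.

```python
import math
from statistics import mean, variance

old_design = [100, 120, 110, 130, 115]
new_design = [130, 145, 135, 150, 140]

n1, n2 = len(old_design), len(new_design)

# Pooled variance: assumes both groups share the same population variance.
pooled_var = ((n1 - 1) * variance(old_design)
              + (n2 - 1) * variance(new_design)) / (n1 + n2 - 2)
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (mean(old_design) - mean(new_design)) / std_err
df = n1 + n2 - 2
print(round(t_stat, 3), df)
```

The t-statistic is negative only because the old design is listed first. Its size (about 4.08 with 8 degrees of freedom) is well past the two-tailed 5% critical value of 2.306.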

  1. F-Test (Snedecor’s F-Distribution)

Definition

The F-distribution is a sampling distribution used to compare the variances of two independent samples. If

· $X$ has a chi-square distribution with $d_1$ degrees of freedom (DOF),

· $Y$ has a chi-square distribution with $d_2$ DOF,

then

$$F = \frac{X/d_1}{Y/d_2} \quad \text{follows an F-distribution with } (d_1, d_2) \text{ DOF}.$$

For two independent samples from normal populations with the same variance:

$$F = \frac{S_1^2}{S_2^2} = \frac{\sum_{i=1}^{n_1} (x_i - \bar{x}_1)^2 / (n_1 - 1)}{\sum_{j=1}^{n_2} (x_j - \bar{x}_2)^2 / (n_2 - 1)}$$

Rule: The larger variance is always placed in the numerator → $F \ge 1$.

Procedure for F-Test

  1. Null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ (no significant difference between variances).

  2. Alternative hypothesis $H_a$ (one- or two-tailed as per problem).

  3. Compute sample means:

    $$\bar{x}_1 = \frac{\sum x_1}{n_1}, \quad \bar{x}_2 = \frac{\sum x_2}{n_2}$$

  4. Compute sample variances $S_1^2$ and $S_2^2$:

    $$S_1^2 = \frac{\sum (x_i - \bar{x}_1)^2}{n_1 - 1}, \quad S_2^2 = \frac{\sum (x_j - \bar{x}_2)^2}{n_2 - 1}$$

    (If variances are given directly, use them.)

  5. Calculate $F_c = \frac{\text{larger variance}}{\text{smaller variance}}$.

  6. Compare with F-table value at given $\alpha$ and DOF $(n_1-1, n_2-1)$, with the numerator sample's DOF listed first.

Acceptance criterion:

· If $F_c < F_{\text{table}}$ → Fail to reject $H_0$ (no significant difference between variances).

· If $F_c \ge F_{\text{table}}$ → Reject $H_0$ (variances differ significantly).

Worked Example – Packaging Machine Weights

Data: Two machines A and B, each with 10 packs. Nominal weight should be consistent.

Raw data: only nine of the ten Machine A readings survive in these notes (50.8, 51.0, 49.5, 52.1, 51.8, 41.4, 51.5, 49.0, 48.0), and the Machine B readings are not reproduced. The summary statistics below are therefore taken from the source PDF, which used $n_1 = n_2 = 10$:

$$\bar{x}_1 = 49.93, \quad \bar{x}_2 = 49.03$$

$$S_1^2 = 2.9709, \quad S_2^2 = 0.4506$$

$$F_c = \frac{2.9709}{0.4506} = 6.5932$$

DOF = (9, 9), $\alpha = 0.05$, $F_{\text{table}} = 3.18$

Since 6.5932 > 3.18 → Reject $H_0$. Conclude machines have significantly different variances.
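
The decision step can be sketched in Python from the summary statistics alone. The variances and the table value are the ones quoted in these notes; the script does not look up the critical value itself.

```python
s1_sq, s2_sq = 2.9709, 0.4506   # sample variances quoted from the PDF
n1, n2 = 10, 10                 # packs per machine

f_calc = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)   # larger variance goes on top
dof = (n1 - 1, n2 - 1)          # equal sample sizes, so the order does not matter here
f_table = 3.18                  # table value quoted above for alpha = 0.05, DOF (9, 9)

reject_h0 = f_calc >= f_table
print(round(f_calc, 4), dof, reject_h0)
```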


  2. Chi-Square ($\chi^2$) Test – Goodness of Fit

Definition

Used for categorical variables to test how well observed data fit an expected distribution.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where $O$ = observed frequency, $E$ = expected frequency.

Properties

· Only non-negative values, skewed right.

· Family of distributions indexed by degrees of freedom (DF).

· DF = $k - 1$ (where $k$ = number of categories).

Acceptance Criteria (at significance level $\alpha$)

· If $\chi^2_{\text{stat}} > \chi^2_{\text{critical}}(\alpha, k-1)$ → Reject $H_0$.

· If $\chi^2_{\text{stat}} \le \chi^2_{\text{critical}}$ → Fail to reject $H_0$.

Worked Example – Coin Toss

A coin is tossed 100 times and heads is observed 65 times. Test for bias at $\alpha = 0.01$.

Hypotheses:

$H_0$: Coin is fair (Heads = Tails = 50)

$H_a$: Coin is biased

Observed: $O_H = 65$, $O_T = 35$

Expected: $E_H = 50$, $E_T = 50$

$$\chi^2 = \frac{(65-50)^2}{50} + \frac{(35-50)^2}{50} = \frac{225}{50} + \frac{225}{50} = 4.5 + 4.5 = 9$$

With Yates’ continuity correction (the source PDF subtracts 0.5 from each $|O - E|$; this is usually applied when DF = 1, although $n$ is large here):

$$\chi^2 = \frac{(|65-50|-0.5)^2}{50} + \frac{(|35-50|-0.5)^2}{50} = \frac{(14.5)^2}{50} + \frac{(14.5)^2}{50} = \frac{210.25}{50} \times 2 = 8.41$$

Critical value: $\chi^2_{0.01, 1} = 6.635$

Since 9 > 6.635 (and 8.41 > 6.635 with the correction) → Reject $H_0$. Coin is biased.
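
The two statistics can be reproduced with a short standard-library sketch. The p-value helper relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so its upper tail equals $\text{erfc}(\sqrt{\chi^2/2})$; it is valid for DF = 1 only, and the function name is an illustrative choice.

```python
import math

observed = [65, 35]
expected = [50, 50]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
chi2_yates = sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

print(chi2, round(chi2_yates, 2))
print(p_value_df1(chi2) < 0.01, p_value_df1(chi2_yates) < 0.01)
```

Both p-values fall below $\alpha = 0.01$, matching the comparison against the critical value 6.635.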


  3. Student’s t-Distribution

Definition

Used when sample size is small ($n \le 30$) and the population standard deviation $\sigma$ is unknown. Developed by W.S. Gosset (pseudonym “Student”).

$$t = \frac{\bar{x} - \mu}{S / \sqrt{n}}, \quad \text{where } S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

· $\bar{x}$ = sample mean, $\mu$ = population mean, $n$ = sample size, $S$ = sample standard deviation.

Properties

· Ranges from $-\infty$ to $+\infty$.

· Bell-shaped, symmetric about 0, but heavier tails than normal.

· DOF = $n - 1$.

· Used when population standard deviation unknown.

Types of t-Tests

  1. One-sample t-test – compares sample mean to a known population mean.

  2. Independent two-sample t-test – compares means of two independent groups.

    $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$

  3. Paired t-test – compares two related samples (e.g., before and after).

Acceptance Criteria

· If $|t_{\text{calc}}| > t_{\text{critical}}$ → Reject $H_0$.

· If $|t_{\text{calc}}| \le t_{\text{critical}}$ → Fail to reject $H_0$.


  4. ANOVA – Analysis of Variance

Definition

Compares means of more than two populations simultaneously. Developed by R.A. Fisher.

Example uses:

· Yield of crop from several seed varieties.

· Smoking habits across multiple groups.

· Gasoline mileage of different automobiles.

Procedure (One-Way ANOVA)

  1. Compute mean of each sample: $\bar{x}_1, \bar{x}_2, \dots, \bar{x}_k$.

  2. Compute overall mean: $\bar{\bar{x}} = \frac{\sum \bar{x}_i}{k}$ (weighted by sample sizes if unequal).

  3. Sum of squares between groups (treatment variation):

    $$SS_{\text{between}} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{\bar{x}})^2$$

  4. Sum of squares within groups (error variation):

    $$SS_{\text{within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

  5. Compute $F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$, where $MS = SS/DF$.

  6. Compare with F-table (DOF between = $k-1$, DOF within = $N-k$, where $N$ is the total number of observations).

Worked Example – Studying Methods

Three methods (A, B, C), each with 10 students. Test if mean scores differ.

Data summary (from PDF):

Method A mean = 8.7, B mean = 8.6, C mean = 8.5, overall mean = 8.6.

Between-group sum of squares:

$$10(8.7-8.6)^2 + 10(8.6-8.6)^2 + 10(8.5-8.6)^2 = 10(0.01) + 0 + 10(0.01) = 0.2$$

Within-group sum of squares (sum of squared deviations inside each method):

Given in PDF: $SS_A = 6.6$, $SS_B = 10.9$, $SS_C = 10.5$ → Total $SS_{\text{within}} = 28.0$

ANOVA table:

| Source | SS | DF | MS | F |
|---|---|---|---|---|
| Between | 0.2 | 2 | 0.1 | 0.1/1.037 ≈ 0.096 |
| Within | 28.0 | 27 | 1.037 | |
| Total | 28.2 | 29 | | |

Note: $MS_{\text{within}} = 28/27 \approx 1.037$, so $F = 0.1 / 1.037 \approx 0.096$. The source PDF reports 0.0071, which does not follow from these sums of squares and is possibly a miscalculation. Either way, F is very small (<1), so there is no significant difference between methods.

Acceptance: If $F_{\text{calc}} < F_{\text{critical}}$, fail to reject $H_0$ (all means equal).
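
The F-ratio for this example can be recomputed from the summary figures with a few lines of Python. This sketch assumes equal group sizes, so the plain average of the group means is the overall mean, and it takes the per-method sums of squares from the PDF as given.

```python
k, n_per_group = 3, 10
group_means = [8.7, 8.6, 8.5]
ss_within_groups = [6.6, 10.9, 10.5]   # per-method sums of squares quoted from the PDF

grand_mean = sum(group_means) / k      # valid because all groups have the same size
ss_between = sum(n_per_group * (m - grand_mean) ** 2 for m in group_means)
ss_within = sum(ss_within_groups)

df_between = k - 1
df_within = k * n_per_group - k
f_calc = (ss_between / df_between) / (ss_within / df_within)

print(round(ss_between, 3), round(ss_within, 1), round(f_calc, 3))
```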


  5. Design of Experiments (DOE) – Simple Factorial

Example Table (2 Factors)

| Experiment No | Temperature (°C) | Pressure (Bar) | Output Quality |
|---|---|---|---|
| 1 | Low | Low | 70 |
| 2 | Low | High | 75 |
| 3 | High | Low | 80 |
| 4 | High | High | 90 |

Conclusion: High temperature and high pressure give the best output quality.
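
One common way to read a 2×2 factorial table like this is to compute each factor's main effect (average output at the high level minus average output at the low level) and the interaction. The sketch below uses the four output values from the table; the function and variable names are illustrative assumptions.

```python
# Output quality keyed by (temperature level, pressure level), from the table above.
quality = {
    ("low", "low"): 70,
    ("low", "high"): 75,
    ("high", "low"): 80,
    ("high", "high"): 90,
}

def main_effect(factor_index):
    """Mean output at the high level minus mean output at the low level of one factor."""
    high = [q for levels, q in quality.items() if levels[factor_index] == "high"]
    low = [q for levels, q in quality.items() if levels[factor_index] == "low"]
    return sum(high) / len(high) - sum(low) / len(low)

temperature_effect = main_effect(0)
pressure_effect = main_effect(1)

# Interaction: half the difference between the pressure effect at high and at low temperature.
interaction = ((quality[("high", "high")] - quality[("high", "low")])
               - (quality[("low", "high")] - quality[("low", "low")])) / 2

print(temperature_effect, pressure_effect, interaction)
```

In these four runs temperature has the larger effect, and the positive interaction suggests pressure helps more at high temperature, though a single unreplicated run per cell cannot show whether these differences are significant.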


Summary Diagram of Statistical Test Selection

  
                             ┌─────────────────────┐
                             │  What is your goal? │
                             └──────────┬──────────┘
                                        │
            ┌───────────────────────────┼───────────────────────────┐
            │                           │                           │
            ▼                           ▼                           ▼
   ┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
   │ Compare variance│        │ Compare means   │        │ Compare means   │
   │ of 2 groups     │        │ of 1 group to   │        │ of >2 groups    │
   │                 │        │ known value     │        │                 │
   └────────┬────────┘        └────────┬────────┘        └────────┬────────┘
            │                          │                          │
            ▼                          ▼                          ▼
   ┌─────────────────┐        ┌─────────────────┐        ┌─────────────────┐
   │    F-test       │        │  One-sample     │        │   ANOVA         │
   │                 │        │  t-test         │        │  (F-test)       │
   └─────────────────┘        └─────────────────┘        └─────────────────┘

   For categorical data (goodness of fit) → Chi-square test
  

  
Sub section 1.2
  
 
  
  
 
  
Enhanced Statistical Tests – Integrated Study Notes
  
Below is your reorganized content with my additions: key assumptions, when to use, limitations, additional formulas/variations, effect size interpretations, common pitfalls, and non-parametric alternatives where relevant. I’ve also corrected minor inconsistencies (e.g., ANOVA calculations) and added practical insights from standard statistical practice.
  
1. F-Test (Variance Comparison)
  
Core Formula
  
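If $X \sim \chi^2(d_1)$ and $Y \sim \chi^2(d_2)$, then $F = \frac{X/d_1}{Y/d_2} \sim F(d_1, d_2)$.
  
For samples: $F = \frac{S_1^2}{S_2^2}$
  
(larger variance in numerator → $F \ge 1$)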
  
Key Assumptions
  
•	Populations are normally distributed.
  
•	Samples are independent.
  
•	Robust to moderate non-normality for large samples, but sensitive with small $n$.
  
Procedure Additions
  
•	Always use upper-tail critical value when larger variance is in numerator.
  
•	For a two-tailed test: compare $F$ to $F_{\alpha/2}(n_1 - 1, n_2 - 1)$, or double the one-tailed p-value.

•	Effect size: Variance ratio itself (e.g., $F \approx 6.6$ means ~6.6× more variable).
  
Worked Example (Packaging Machines) – Your values check out:
  
$S_1^2 = 2.9709$, $S_2^2 = 0.4506$, $F = 6.593$, $F_{0.05}(9, 9) = 3.18$

→ Reject $H_0$. Machines have significantly different precision.
  
Pitfall: Do not use F-test on non-normal data (especially heavy tails). Consider Levene’s or Brown-Forsythe test instead.
  
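2. Chi-Square ($\chi^2$) Tests

Goodness-of-Fit

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

DF = $k - 1$ (or $k - 1 - m$ if $m$ parameters are estimated from the data).

Yates’ Continuity Correction (for 1 DF, small $n$):

$\chi^2 = \sum \dfrac{(|O - E| - 0.5)^2}{E}$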
  
Assumptions
  
•	Expected frequencies $E \ge 5$ in most cells (or all ≥1 with no more than 20% <5).
  
•	Independent observations.
  
Worked Example (Coin): Your calc is correct. $\chi^2 = 9.0$ (critical $\chi^2_{0.01,1} = 6.635$, DF=1) → biased. With Yates: 8.41 still significant.
  
Test of Independence / Homogeneity (Important Addition)
  
Use for contingency tables (e.g., gender vs. preference).
  
DF = $(r - 1)(c - 1)$. Same formula.
  
When to Choose Chi-Square
  
•	Categorical data only.
  
•	Large sample sizes.
  
Alternatives: Fisher’s Exact Test (small $n$), G-test.
  
3. Student’s t-Tests
  
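One-Sample

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$, DF = $n - 1$

Independent Two-Sample (assume equal variance first)

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{1/n_1 + 1/n_2}}$

Pooled variance:

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

Welch’s t-test (unequal variances – more robust):

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$

with approximate DF (Satterthwaite).

Paired t-test

$t = \dfrac{\bar{d}}{s_d / \sqrt{n}}$, where $d = x_1 - x_2$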
  
Assumptions (critical)
  
•	Normality of data (or of differences in paired). Central Limit Theorem helps for $n > 30$.
  
•	Independence of observations.
  
•	Equal variances (for pooled version) → test first with F-test.
  
Effect Size: Cohen’s $d$ (0.2 small, 0.5 medium, 0.8 large).
  
Common Pitfall: Using independent t-test on paired data (inflates Type II error).
  
4. One-Way ANOVA
  
Core Idea: Partition total variance into Between + Within.
  
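Formulas (your notes are good):

$SS_{Total} = SS_{Between} + SS_{Within}$, $F = \dfrac{MS_{Between}}{MS_{Within}} = \dfrac{SS_{Between}/(k - 1)}{SS_{Within}/(N - k)}$

Your Studying Methods Example (corrected interpretation):

Between SS = 0.2, Within SS = 28, $F \approx 0.096$ (very small).

Fail to reject $H_0$ → no evidence methods differ.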
  
Post-Hoc Tests (if significant): Tukey HSD, Bonferroni, Scheffé.
  
Effect Size: $\eta^2 = SS_{Between} / SS_{Total}$ (proportion of variance explained).
  
Assumptions
  
•	Normality within groups.
  
•	Homogeneity of variances (Levene’s test).
  
•	Independence.
  
Two-Way ANOVA / Factorial (extension of your DOE section): Tests main effects + interaction.
  
Alternatives: Kruskal-Wallis (non-parametric), Welch ANOVA (unequal var).
  
5. Design of Experiments (DOE) – Basics & Additions
  
Full Factorial 2² Example (your table is excellent):
  
Exp	Temp	Pressure	Quality
  
1	Low	Low	70
  
2	Low	High	75
  
3	High	Low	80
  
4	High	High	90
  
Main Effects: Temp effect = (80+90)/2 - (70+75)/2 = 12.5
  
Pressure effect = (75+90)/2 - (70+80)/2 = 7.5
  
Interaction: Present if lines cross (or are clearly non-parallel) in the interaction plot.

Quick Reference: Statistical Tests at a Glance

Test	Purpose	Data Type	Sample Size	Key Formula
F-Test	Compare variances	Continuous	Any	F = S₁²/S₂²
χ² (Chi-Square)	Categorical relationships	Categorical	Large	χ² = Σ(O-E)²/E
t-Test	Compare means (1 or 2)	Continuous	Small (n≤30)	t = (x̄ - μ)/(s/√n)
ANOVA	Compare 3+ means	Continuous	Any	F = MS_B/MS_W
DOE	Process optimization	Mixed	Planned	Factorial design

 

Test Selection Flowchart

Start → What is your research question?

•       Compare variances (2 groups) → F-Test or Levene's Test

•       Compare means (1 sample to known μ) → One-Sample t-Test

•       Compare means (2 independent groups) → Independent t-Test (Welch if unequal var)

•       Compare means (paired/before-after) → Paired t-Test

•       Compare means (3+ groups) → One-Way ANOVA + Post-Hoc Tests

•       Test categorical fit to expected → Chi-Square Goodness of Fit

•       Test association between categorical → Chi-Square Test of Independence

•       Violate assumptions? Small n? → Non-Parametric Alternatives
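The flowchart above can be sketched as a small lookup function. This is only an illustration: the function name `choose_test` and the `goal` labels ("variance", "mean", "fit", "association") are hypothetical names chosen here, not standard terminology.

```python
def choose_test(goal, groups=2, paired=False, assumptions_ok=True):
    """Return the test suggested by the flowchart for a given question.

    `goal` is one of the illustrative labels "variance", "mean",
    "fit", or "association" (hypothetical names, not a standard API).
    """
    if not assumptions_ok:
        return "Non-parametric alternative"
    if goal == "variance":
        return "F-test or Levene's test"
    if goal == "fit":
        return "Chi-square goodness of fit"
    if goal == "association":
        return "Chi-square test of independence"
    if goal == "mean":
        if groups == 1:
            return "One-sample t-test"
        if groups == 2:
            if paired:
                return "Paired t-test"
            return "Independent t-test (Welch if unequal variances)"
        return "One-way ANOVA + post-hoc tests"
    raise ValueError(f"unknown goal: {goal!r}")


print(choose_test("mean", groups=3))
```

The assumption check comes first on purpose: if normality or sample size is in doubt, the parametric branches are skipped entirely.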

1. F-TEST (Variance Comparison)

Definition

The F-test compares variances of two independent samples using the F-distribution. It answers: Do two populations have significantly different spreads?

Core Formula

F = S₁²/S₂² (larger variance always in numerator → F ≥ 1)

Where S² = Σ(x - x̄)² / (n-1)

Assumptions

•       Both populations normally distributed

•       Samples are independent

•       Random sampling used

⚠ Warning: Sensitive to non-normality, especially with small samples.

Procedure

•       Step 1: State H₀: σ₁² = σ₂² (variances equal) vs H₁: σ₁² ≠ σ₂²

•       Step 2: Compute sample variances S₁² and S₂²

•       Step 3: Calculate F = larger/smaller

•       Step 4: Find critical value F_α(n₁-1, n₂-1) from F-table (numerator DF belongs to the larger variance; for the two-sided H₁ in Step 1, use the α/2 column)

•       Step 5: Decision → If F_calc ≥ F_table, reject H₀

Worked Example: Packaging Machine Precision

Two packaging machines, 10 samples each. Test if precision differs at α = 0.05.

Given: S₁² = 2.9709, S₂² = 0.4506, n₁ = n₂ = 10

F = 2.9709 / 0.4506 = 6.593

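Critical value: F₀.₀₅(9,9) = 3.18 (upper 5% table value; for the two-sided H₁ the stricter F₀.₀₂₅(9,9) ≈ 4.03 applies)

Since 6.593 exceeds both 3.18 and 4.03 → Reject H₀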

Conclusion: Machines have significantly different precision.
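The packaging calculation can be reproduced in a few lines. This is a minimal sketch: the helper name `f_ratio` is made up for illustration, and the critical value is the table figure quoted in these notes rather than something computed here.

```python
def f_ratio(var_a, var_b):
    """Return F with the larger sample variance in the numerator."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller


# Sample variances from the packaging example. With raw data,
# statistics.variance() would give S^2 using the n - 1 denominator.
s1_sq, s2_sq = 2.9709, 0.4506
f_calc = f_ratio(s1_sq, s2_sq)

# Table value quoted in the notes for DF = (9, 9); not computed here.
f_table = 3.18
reject_h0 = f_calc >= f_table

print(f"F = {f_calc:.3f}, reject H0: {reject_h0}")
```

Because `f_ratio` always divides the larger variance by the smaller, the order of the arguments does not matter and the result is never below 1.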

Effect Size

•       F-ratio itself indicates effect size (e.g., F=6.6 means 6.6× variance difference)

•       Larger F → More significant difference in spread

Common Pitfalls

•       Using F-test on severely non-normal data → Consider Levene's or Brown-Forsythe

•       Forgetting to place larger variance in numerator

•       Using the wrong DF in the table lookup

Alternatives

•       Levene's Test (more robust to non-normality)

•       Brown-Forsythe Test (median-based, even more robust)

2. CHI-SQUARE (χ²) TEST

Definition

Chi-square tests the relationship between categorical variables. It answers: Do observed frequencies fit an expected distribution? Are two categorical variables associated?

Core Formula

χ² = Σ [(O - E)² / E]

Where O = observed frequency, E = expected frequency

Degrees of Freedom

•       Goodness of fit: DF = k - 1 (k = number of categories)

•       Independence test: DF = (r - 1)(c - 1) (r rows, c columns)

Assumptions

•       Expected frequencies E ≥ 5 in at least 80% of cells

•       Independent observations

•       Large sample sizes recommended

Procedure

•       Step 1: State H₀ (fit expected / no association) vs H₁

•       Step 2: Count observed frequencies O

•       Step 3: Calculate expected frequencies E

•       Step 4: Compute χ²_calc = Σ(O-E)²/E

•       Step 5: Compare χ²_calc with χ²_α(DF)

•       Step 6: If χ²_calc > χ²_table, reject H₀
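For the independence test, Step 3 uses E = (row total × column total) / N for each cell. The sketch below applies the procedure to a hypothetical 2 × 2 table; the counts and the function name `chi_square_independence` are invented for illustration.

```python
def chi_square_independence(table):
    """Return (chi2, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts (rows: two groups, columns: two preferences).
observed = [[30, 20],
            [20, 30]]
chi2, df = chi_square_independence(observed)
print(chi2, df)
```

Every expected count here is 25, so each cell contributes 25/25 = 1 and the statistic is 4.0 with DF = 1, which exceeds the 5% critical value of 3.841.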

Worked Example: Coin Bias Test

Coin tossed 100 times: 65 heads, 35 tails. Test fairness at α = 0.01.

Observed: O_H = 65, O_T = 35

Expected: E_H = 50, E_T = 50

χ² = (65-50)²/50 + (35-50)²/50 = 225/50 + 225/50 = 9.0

Critical: χ²₀.₀₁,₁ = 6.635

Since 9.0 > 6.635 → Reject H₀

Conclusion: Coin is biased.

Yates Continuity Correction

χ² = Σ [(|O - E| - 0.5)² / E]

Use for 1 DF when expected frequencies are small (< 10). For the coin example: χ² = 2 × (15 - 0.5)²/50 = 8.41, slightly smaller than 9.0 but still above 6.635.
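The coin example, with and without the Yates correction, can be checked with a short sketch. The function names are illustrative. The p-value helper relies on the fact that with DF = 1 a chi-square variable is a squared standard normal, so it is valid for DF = 1 only.

```python
import math


def chi_square_gof(observed, expected, yates=False):
    """Chi-square goodness-of-fit statistic, optionally Yates-corrected."""
    correction = 0.5 if yates else 0.0
    return sum(
        (abs(o - e) - correction) ** 2 / e
        for o, e in zip(observed, expected)
    )


def p_value_df1(chi2):
    """Upper-tail p-value, valid for DF = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2))


observed = [65, 35]   # heads, tails
expected = [50, 50]   # fair coin, 100 tosses

plain = chi_square_gof(observed, expected)
corrected = chi_square_gof(observed, expected, yates=True)

print(f"plain: {plain:.2f}, p = {p_value_df1(plain):.4f}")
print(f"Yates: {corrected:.2f}, p = {p_value_df1(corrected):.4f}")
```

Both p-values fall below α = 0.01, which matches the table-based decision to reject H₀.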

Common Pitfalls

•       Using chi-square with E < 5 → Violates assumptions

•       Forgetting the squared term (O-E)²

•       Confusing the χ² test with the t-test (different data types!)

Alternatives

•       Fisher's Exact Test (small samples)

•       G-Test (log-likelihood ratio)

3. STUDENT'S t-TEST

Definition

The t-test compares means when sample sizes are small (n ≤ 30) and population variance is unknown. Developed by W.S. Gosset (pseudonym "Student").

Core Formulas

One-Sample t

t = (x̄ - μ) / (s / √n), DF = n - 1
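A minimal sketch of the one-sample formula using only the standard library. The measurements and the hypothesised mean of 5.0 are made-up values for illustration.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for H0: population mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample SD, n - 1 denominator
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements tested against a claimed mean of 5.0.
sample = [5.1, 4.9, 5.3, 5.0, 5.2]
t, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t:.3f}, DF = {df}")
```

Here t is about 1.414 with DF = 4, well below the two-tailed 5% critical value of 2.776, so H₀ would not be rejected for this sample.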

Independent Two-Sample t (Equal Variance)

t = (x̄₁ - x̄₂) / (s_p √(1/n₁ + 1/n₂))

where s_p² = [(n₁-1)s₁² + (n₂-1)s₂²] / (n₁ + n₂ - 2)

Welch's t (Unequal Variance - Preferred)

t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

(Welch's DF computed via Satterthwaite approximation)
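The Satterthwaite degrees of freedom are not written out above, so the sketch below shows the usual Welch-Satterthwaite expression alongside the t statistic. The summary statistics are hypothetical.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for Welch's unequal-variance t-test."""
    a, b = var1 / n1, var2 / n2          # squared standard errors
    t = (mean1 - mean2) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two independent groups.
t, df = welch_t(mean1=10.0, var1=4.0, n1=10, mean2=8.0, var2=9.0, n2=15)
print(f"t = {t:.3f}, DF = {df:.2f}")
```

The resulting DF is generally not an integer (about 22.99 here, versus 23 for the pooled test); tables are usually read at the next lower whole number.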

Paired t-Test

t = d̄ / (s_d / √n), where d = x₁ - x₂
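The paired test is just a one-sample test on the differences, as this sketch shows. The before/after scores are invented for illustration.

```python
import math
import statistics


def paired_t(x1, x2):
    """Return (t, df) for the paired t-test on d = x1 - x2."""
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)         # SD of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical scores for the same five subjects, after vs. before.
after = [75, 77, 80, 74, 75]
before = [72, 75, 78, 71, 74]
t, df = paired_t(after, before)
print(f"t = {t:.2f}, DF = {df}")
```

The differences are [3, 2, 2, 3, 1], giving t ≈ 5.88 with DF = 4. Note that DF counts pairs, not individual observations.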

Assumptions

•       Data normally distributed (or n large enough for the CLT to apply, roughly n > 30)

•       Observations independent

•       Equal variances (for pooled version) → Test with F-test first

Acceptance Criterion

•       If |t_calc| > t_critical → Reject H₀

•       If |t_calc| ≤ t_critical → Fail to reject H₀

Effect Size: Cohen's d

d = (x̄₁ - x̄₂) / s_p

•       d = 0.2 → Small effect

•       d = 0.5 → Medium effect

•       d = 0.8 → Large effect
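Cohen's d and the pooled t statistic share the same s_p, so they are easy to compute together. In this sketch the summary statistics are hypothetical, and the "negligible" label for |d| < 0.2 is an added convenience rather than part of the thresholds listed above.

```python
import math


def pooled_sd(var1, n1, var2, n2):
    """Pooled standard deviation s_p from two sample variances."""
    sp_sq = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return math.sqrt(sp_sq)


def cohens_d(mean1, mean2, sp):
    """Standardised mean difference."""
    return (mean1 - mean2) / sp


def label(d):
    """Map |d| onto the conventional thresholds."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"   # label added here for illustration


# Hypothetical summary statistics.
n1, n2 = 10, 15
sp = pooled_sd(var1=4.0, n1=n1, var2=9.0, n2=n2)
d = cohens_d(10.0, 8.0, sp)
t_pooled = (10.0 - 8.0) / (sp * math.sqrt(1 / n1 + 1 / n2))
print(f"s_p = {sp:.3f}, d = {d:.3f} ({label(d)}), t = {t_pooled:.3f}")
```

Unlike t, d does not grow with sample size, which is why it is reported alongside the test result.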

Common Pitfalls

•       Using pooled t-test with unequal variances → Use Welch instead

•       Using independent t on paired data (violates independence)

•       Ignoring normality assumption

Non-Parametric Alternatives

•       One-sample: Wilcoxon Signed-Rank

•       Two-sample: Mann-Whitney U

•       Paired: Wilcoxon Signed-Rank

4. ONE-WAY ANOVA (Analysis of Variance)

Definition

ANOVA compares means of 3 or more groups. Developed by R.A. Fisher. It partitions total variance into between-group and within-group components.

Core Concept

SS_Total = SS_Between + SS_Within

Formulas

Between-Group Variance

SS_Between = Σ nᵢ (x̄ᵢ - x̄̄)²

Within-Group Variance

SS_Within = Σ Σ (xᵢⱼ - x̄ᵢ)²

F-Ratio

F = MS_Between / MS_Within = (SS_B/(k-1)) / (SS_W/(N-k))

where k = number of groups, N = total observations

Procedure

•       Step 1: Compute mean of each group (x̄₁, x̄₂, ..., x̄_k)

•       Step 2: Compute overall mean x̄̄

•       Step 3: Calculate SS_Between and SS_Within

•       Step 4: Compute MS values and F-ratio

•       Step 5: Compare F_calc with F_α(k-1, N-k)

•       Step 6: If F_calc > F_table, reject H₀

Worked Example: Study Methods (A, B, C)

10 students per method. Test if mean scores differ at α = 0.05.

Means: x̄_A = 8.7, x̄_B = 8.6, x̄_C = 8.5, x̄̄ = 8.6

ANOVA Table:

Source	SS	DF	MS	F
Between	0.2	2	0.1	0.096
Within	28.0	27	1.037
Total	28.2	29

 

F = 0.1 / 1.037 = 0.096 << F_0.05(2,27) ≈ 3.35

Decision: Fail to reject H₀ → No significant difference between methods.
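The ANOVA table for this example can be rebuilt from the group means, the group sizes, and the within-group sum of squares. This is a sketch under the assumption that SS_Within = 28.0 is taken as given, since the raw scores are not listed here; the function name is illustrative.

```python
def one_way_anova(means, sizes, ss_within):
    """Return (ss_between, ms_between, ms_within, f) from group summaries."""
    k, n_total = len(means), sum(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ss_between, ms_between, ms_within, ms_between / ms_within


ss_b, ms_b, ms_w, f = one_way_anova(
    means=[8.7, 8.6, 8.5], sizes=[10, 10, 10], ss_within=28.0
)
print(f"SS_B = {ss_b:.2f}, MS_B = {ms_b:.2f}, MS_W = {ms_w:.3f}, F = {f:.3f}")
```

The grand mean is weighted by group size, so the same function also works for unequal groups.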

Effect Size: Eta-Squared

η² = SS_Between / SS_Total

(Proportion of variance explained by group membership)
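As a worked check using the sums of squares from the Study Methods example (SS_Between = 0.2, SS_Total = 28.2):

```latex
\eta^2 = \frac{SS_{Between}}{SS_{Total}} = \frac{0.2}{28.2} \approx 0.007
```

So group membership accounts for under 1% of the variance in scores, which agrees with the non-significant F.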

Post-Hoc Tests (if H₀ rejected)

•       Tukey HSD (most popular)

•       Bonferroni (conservative)

•       Scheffé (most flexible)

Assumptions

•       Normality within each group

•       Homogeneity of variances (test with Levene's)

•       Independence of observations

Common Pitfalls

•       Using ANOVA without checking homogeneity first

•       Not using post-hoc when groups differ significantly

•       Ignoring interaction effects in factorial designs

Alternatives

•       Kruskal-Wallis (non-parametric, ordinal data)

•       Welch ANOVA (unequal variances)

5. DESIGN OF EXPERIMENTS (DOE) BASICS

Purpose

Systematically vary factors to optimize process output. Common in engineering, manufacturing, agriculture.

Worked Example: Temperature × Pressure Factorial

Exp	Temperature	Pressure	Output Quality
1	Low	Low	70
2	Low	High	75
3	High	Low	80
4	High	High	90

 

Main Effects Analysis:

Temperature effect = (80+90)/2 - (70+75)/2 = 12.5

Pressure effect = (75+90)/2 - (70+80)/2 = 7.5

Best setting: High Temperature + High Pressure → Output 90
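The main effects can be computed by coding Low as -1 and High as +1. The sketch below also adds the interaction contrast (the product of the two coded levels), a standard 2² calculation that is not worked out in the notes above.

```python
# Runs from the table, coded as -1 (Low) and +1 (High):
# (temperature, pressure, output quality)
runs = [
    (-1, -1, 70),
    (-1, +1, 75),
    (+1, -1, 80),
    (+1, +1, 90),
]


def effect(runs, contrast):
    """Mean response where the contrast is +1 minus mean where it is -1."""
    high = [y for t, p, y in runs if contrast(t, p) > 0]
    low = [y for t, p, y in runs if contrast(t, p) < 0]
    return sum(high) / len(high) - sum(low) / len(low)


temperature_effect = effect(runs, lambda t, p: t)
pressure_effect = effect(runs, lambda t, p: p)
# Interaction contrast: product of the two coded levels.
interaction_effect = effect(runs, lambda t, p: t * p)

print(temperature_effect, pressure_effect, interaction_effect)
```

With a single run per combination there is no replication, so these effects can be ranked but not tested for significance.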

DOE Principles

•       Randomization: Reduces bias from unknown variables

•       Replication: Provides error estimates

•       Blocking: Controls nuisance factors

•       Factorial Design: Examines all factor combinations

•       Response Surface Methodology: Models continuous optimization

Common DOE Types

•       Full Factorial 2^k (all combinations)

•       Fractional Factorial (screening, fewer experiments)

•       Central Composite (curvature testing)

•       Taguchi (robust design, noise factors)

6. NON-PARAMETRIC ALTERNATIVES

When assumptions fail (non-normal, small n, ordinal data), use these:

Parametric Test	Non-Parametric Alternative
One-sample t	Wilcoxon Signed-Rank
Independent t	Mann-Whitney U
Paired t	Wilcoxon Signed-Rank
ANOVA	Kruskal-Wallis H
Correlation	Spearman Rank, Kendall τ

 

7. BEST PRACTICES & COMMON PITFALLS

Before Testing

•       ✓ Check normality (Shapiro-Wilk, Q-Q plots)

•       ✓ Check equal variance (Levene's test)

•       ✓ Verify independence

•       ✓ Plan sample size (power analysis)

While Testing

•       ✓ Use appropriate test for data type

•       ✓ Report confidence intervals (not just p-values)

•       ✓ Report effect size (Cohen's d, η², etc.)

•       ✓ Adjust for multiple comparisons (Bonferroni)
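The Bonferroni adjustment mentioned above simply divides α by the number of tests. A minimal sketch, with hypothetical p-values:

```python
def bonferroni(p_values, alpha=0.05):
    """Flag which tests stay significant after Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values from three pairwise comparisons.
decisions = bonferroni([0.01, 0.04, 0.03])
print(decisions)
```

With three tests the per-test threshold is 0.05 / 3 ≈ 0.0167, so only the first comparison remains significant even though all three are below 0.05.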

Interpretation Rules

•       p < α: Reject H₀ (statistically significant)

•       p ≥ α: Fail to reject H₀ (not significant)

•       p-value ≠ probability H₀ is true

•       Small p-value = strong evidence against H₀

Critical Pitfalls to Avoid

•       ❌ Relying only on p-values (ignoring effect size)

•       ❌ p-hacking / Multiple testing without correction

•       ❌ Using wrong test for data type

•       ❌ Assuming correlation = causation

•       ❌ Violating assumptions without sensitivity checks

8. FORMULA QUICK REFERENCE SHEET

Formulas for All Tests

Test	Formula	Critical Info
F-Test	F = S₁²/S₂²	DF = (n₁-1, n₂-1)
χ²	χ² = Σ(O-E)²/E	DF = k-1 or (r-1)(c-1)
One-Sample t	t = (x̄-μ)/(s/√n)	DF = n-1
Two-Sample t	t = (x̄₁-x̄₂)/(s_p√(1/n₁+1/n₂))	DF = n₁+n₂-2
ANOVA	F = MS_B/MS_W	DF = (k-1, N-k)
Cohen's d	d = (x̄₁-x̄₂)/s_p	0.2=small, 0.5=med, 0.8=large

 

Final Note for Exam Success

Remember: Each test answers a specific question about your data. Always:

•       Understand the question (what are you comparing?)

•       Check assumptions first

•       Choose the right test

•       Report effect size + confidence interval, not just p-value

•       Interpret in context (statistical significance ≠ practical significance)

Good luck with your M.Tech exams and viva! 🎓
