Slide 1 — Statistical inference about the difference of population means（第1页——总体均值差的统计推断）

Knowledge Points (知识点)

Inference for the difference of two population means（两个总体均值差的统计推断）
Case 1: population standard deviations σ₁ and σ₂ known（情形一：已知总体标准差 σ₁、σ₂）
Case 2: population standard deviations σ₁ and σ₂ unknown（情形二：未知总体标准差 σ₁、σ₂）

Explanation（解释）

We study how to use two independent samples to make inferences about the difference in population means $μ_{1} - μ_{2}$ .
There are two main scenarios:
- When both population standard deviations $σ_{1}$ and $σ_{2}$ are known, we use the z-distribution.
- When at least one of $σ_{1}, σ_{2}$ is unknown, we estimate them using sample standard deviations and use t-based methods.
The goal is to construct confidence intervals and perform hypothesis tests about $μ_{1} - μ_{2}$ .

📖 点击查看中文解释

本章讨论如何利用两个独立样本来推断两个总体均值之差 $μ_{1} - μ_{2}$ 。

主要分为两种情形：

当总体标准差 $σ_{1}$ 与 $σ_{2}$ 已知时，采用 z 分布 进行推断；

当 $σ_{1}$ 或 $σ_{2}$ 至少有一个未知时，用样本标准差估计，并使用 t 分布 方法。

我们的目标是对 $μ_{1} - μ_{2}$ 构造置信区间并进行假设检验。

Example（例子）

Suppose we compare average spending of Group 1 (students who mainly shop online) and Group 2 (students who mainly shop offline).
We want to know whether the population means differ, that is, whether $μ_{1} - μ_{2} \neq 0$ .
Depending on whether $σ_{1}, σ_{2}$ are known or unknown, we choose different formulas and distributions, but the target parameter is always $μ_{1} - μ_{2}$ .

📖 点击查看中文解释

例如比较两组学生的平均消费水平：第一组主要线上消费（总体 1），第二组主要线下消费（总体 2）。

我们关心总体均值是否不同，即 $μ_{1} - μ_{2} \neq 0$ 。

根据 $σ_{1}, σ_{2}$ 是否已知，选择不同的公式和分布，但关注的参数始终是 $μ_{1} - μ_{2}$ 。

Extension（拓展）

This framework applies to many business problems: comparing two products, two branches, or two marketing strategies.
Later sections will also discuss assumptions such as independence, normality, and large-sample approximations that justify using z or t procedures.

📖 点击查看中文解释

该框架可用于大量商业问题：比较两种产品、两家门店或两种营销策略的平均效果。

后续内容还会讨论使用 z 或 t 方法所需的前提假设，如样本独立性、总体近似正态以及大样本近似等。

Summary（小结）

This chapter introduces two-sample inference for the difference in means, with separate methods for known and unknown population standard deviations.

📖 点击查看中文解释

本章介绍针对总体均值差的两样本推断，并根据总体标准差已知或未知采用不同的方法。

Slide 2 — Notation for two-population mean comparison（第2页——两个总体均值比较的符号约定）

Knowledge Points (知识点)

Population parameters $μ_{1}, μ_{2}, σ_{1}, σ_{2}$ （总体参数：均值与标准差）
Sample statistics $\bar{X}_{1}, \bar{X}_{2}, n_{1}, n_{2}$ （样本统计量：均值与样本量）
Difference of population means $μ_{1} - μ_{2}$ and difference of sample means $\bar{X}_{1} - \bar{X}_{2}$ （总体均值差与样本均值差）

Explanation（解释）

Population 1 has mean $μ_{1}$ and standard deviation $σ_{1}$ ; Population 2 has mean $μ_{2}$ and standard deviation $σ_{2}$ .
We draw independent samples:
- Sample 1: size $n_{1}$ , sample mean $\bar{X}_{1}$ , from Population 1.
- Sample 2: size $n_{2}$ , sample mean $\bar{X}_{2}$ , from Population 2.
The parameter of interest is the difference in population means $μ_{1} - μ_{2}$ .
The point estimator of $μ_{1} - μ_{2}$ is the difference in sample means $\bar{X}_{1} - \bar{X}_{2}$ .

📖 点击查看中文解释

总体 1 的均值为 $μ_{1}$ ，标准差为 $σ_{1}$ ；总体 2 的均值为 $μ_{2}$ ，标准差为 $σ_{2}$ 。

我们各自从两个总体中抽取独立样本：

样本 1：容量 $n_{1}$ ，样本均值 $\bar{X}_{1}$ ，来自总体 1；

样本 2：容量 $n_{2}$ ，样本均值 $\bar{X}_{2}$ ，来自总体 2。

我们关注的总体参数是总体均值差 $μ_{1} - μ_{2}$ 。

其点估计量是样本均值差 $\bar{X}_{1} - \bar{X}_{2}$ 。

Example（例子）

Population 1: monthly online spending of all business students; Population 2: monthly online spending of all non-business students.
We sample $n_{1} = 40$ business students and $n_{2} = 35$ non-business students and compute their sample means $\bar{X}_{1}, \bar{X}_{2}$ .
We will use $\bar{X}_{1} - \bar{X}_{2}$ to estimate $μ_{1} - μ_{2}$ .

📖 点击查看中文解释

总体 1：所有商科学生的每月线上消费；总体 2：所有非商科学生的每月线上消费。

抽取 $n_{1} = 40$ 个商科学生和 $n_{2} = 35$ 个非商科学生，计算各自的样本均值 $\bar{X}_{1}, \bar{X}_{2}$ 。

用样本均值差 $\bar{X}_{1} - \bar{X}_{2}$ 来估计总体均值差 $μ_{1} - μ_{2}$ 。

Extension（拓展）

The notation extends naturally to paired-sample designs, but here we focus on independent samples.
Clear notation helps when we later derive the sampling distribution of $\bar{X}_{1} - \bar{X}_{2}$ and construct confidence intervals.

📖 点击查看中文解释

这些符号也可以扩展到配对样本的情形，但本节主要讨论独立样本。

统一的符号约定有助于后续推导 $\bar{X}_{1} - \bar{X}_{2}$ 的抽样分布并构造置信区间。

Summary（小结）

We distinguish clearly between population parameters and sample statistics, and recognize $\bar{X}_{1} - \bar{X}_{2}$ as the point estimator of $μ_{1} - μ_{2}$ .

📖 点击查看中文解释

本页明确区分了总体参数与样本统计量，并认识到样本均值差 $\bar{X}_{1} - \bar{X}_{2}$ 是总体均值差 $μ_{1} - μ_{2}$ 的点估计量。

Slide 3 — Distribution of the difference of sample means（第3页——样本均值差的分布）

Knowledge Points (知识点)

Expected value of $\bar{X}_{1} - \bar{X}_{2}$ （样本均值差的期望）
Standard deviation / standard error of $\bar{X}_{1} - \bar{X}_{2}$ when $σ_{1}, σ_{2}$ are known（已知总体标准差时的标准差/标准误）
Role of sample sizes $n_{1}, n_{2}$ （样本量对变异性的影响）

Explanation（解释）

Under the usual assumptions (independent samples, each from a population with mean $μ_{i}$ and variance $σ_{i}^{2}$ ), the expected value of the difference in sample means is

$$E(\bar{X}_{1} - \bar{X}_{2}) = μ_{1} - μ_{2}.$$

When the population standard deviations are known, the standard deviation (also called the standard error) of $\bar{X}_{1} - \bar{X}_{2}$ is

$$σ_{\bar{X}_{1} - \bar{X}_{2}} = \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.$$

Larger sample sizes $n_{1}, n_{2}$ make the standard error smaller, leading to more precise estimates.
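
The standard error comes from the independence of the two samples. A short derivation, assuming only that each sample mean has variance $σ_{i}^{2} / n_{i}$ :

$$
\begin{aligned}
\operatorname{Var}(\bar{X}_{1} - \bar{X}_{2}) &= \operatorname{Var}(\bar{X}_{1}) + \operatorname{Var}(\bar{X}_{2}) && \text{(independent samples)} \\
&= \frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}, \\
σ_{\bar{X}_{1} - \bar{X}_{2}} &= \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.
\end{aligned}
$$

The variances add even though the means are subtracted, because subtracting an independent quantity still adds uncertainty.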

📖 点击查看中文解释

在常见假设下（两个样本相互独立，各自来自均值为 $μ_{i}$ 、方差为 $σ_{i}^{2}$ 的总体），样本均值差的期望值为

$$E(\bar{X}_{1} - \bar{X}_{2}) = μ_{1} - μ_{2}.$$

当总体标准差已知时，样本均值差 $\bar{X}_{1} - \bar{X}_{2}$ 的标准差/标准误为

$$σ_{\bar{X}_{1} - \bar{X}_{2}} = \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.$$

样本量 $n_{1}, n_{2}$ 越大，标准误越小，估计就越精确。

Example（例子）

Let $σ_{1} = 10, σ_{2} = 12, n_{1} = 50, n_{2} = 60$ .
Then

$$σ_{\bar{X}_{1} - \bar{X}_{2}} = \sqrt{\frac{10^{2}}{50} + \frac{12^{2}}{60}} = \sqrt{\frac{100}{50} + \frac{144}{60}} = \sqrt{2 + 2.4} = \sqrt{4.4} \approx 2.10.$$

This tells us the typical sampling variation of the difference in sample means.
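
The same arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `standard_error` is an illustrative choice, not something defined in the slides.

```python
import math

def standard_error(sigma1, sigma2, n1, n2):
    """Standard error of the difference in sample means
    when both population standard deviations are known."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Values from the example: sigma1 = 10, sigma2 = 12, n1 = 50, n2 = 60
se = standard_error(10, 12, 50, 60)
print(f"SE = {se:.2f}")

# Quadrupling both sample sizes halves the standard error
se_big = standard_error(10, 12, 200, 240)
print(f"SE with 4x the sample sizes = {se_big:.2f}")
```

The second call illustrates the role of the sample sizes: because $n_{1}$ and $n_{2}$ sit under a square root, the data must be quadrupled to halve the standard error.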

📖 点击查看中文解释

设 $σ_{1} = 10, σ_{2} = 12, n_{1} = 50, n_{2} = 60$ 。

则

$$σ_{\bar{X}_{1} - \bar{X}_{2}} = \sqrt{\frac{10^{2}}{50} + \frac{12^{2}}{60}} = \sqrt{\frac{100}{50} + \frac{144}{60}} = \sqrt{2 + 2.4} = \sqrt{4.4} \approx 2.10.$$

该值表示样本均值差在重复抽样中的典型波动大小。

Extension（拓展）

When both populations are normal, $\bar{X}_{1} - \bar{X}_{2}$ is exactly normal; when the sample sizes are large, it is approximately normal by the central limit theorem. In either case its mean is $μ_{1} - μ_{2}$ and its standard deviation is $σ_{\bar{X}_{1} - \bar{X}_{2}}$ .
This normality justifies using z-based confidence intervals and hypothesis tests in the “σ known” case.
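
A small simulation can make the sampling distribution concrete. The sketch below assumes normal populations and reuses $σ_{1} = 10, σ_{2} = 12, n_{1} = 50, n_{2} = 60$ from the example; the population means 100 and 90 and the 2000 repetitions are hypothetical choices made only for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# sigma and n follow the slide example; the means are hypothetical
mu1, sigma1, n1 = 100.0, 10.0, 50
mu2, sigma2, n2 = 90.0, 12.0, 60

diffs = []
for _ in range(2000):
    sample1 = [random.gauss(mu1, sigma1) for _ in range(n1)]
    sample2 = [random.gauss(mu2, sigma2) for _ in range(n2)]
    diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))

# The average of the simulated differences should be near mu1 - mu2 = 10,
# and their standard deviation near sqrt(10**2/50 + 12**2/60), about 2.10.
print(f"mean of differences: {statistics.fmean(diffs):.2f}")
print(f"sd of differences:   {statistics.stdev(diffs):.2f}")
```

The simulated mean and standard deviation should land close to the theoretical values, which is what unbiasedness and the standard-error formula predict.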

📖 点击查看中文解释

当两个总体为正态分布，或样本量足够大时， $\bar{X}_{1} - \bar{X}_{2}$ 近似服从均值为 $μ_{1} - μ_{2}$ 、标准差为 $σ_{\bar{X}_{1} - \bar{X}_{2}}$ 的正态分布。

这种正态性为“已知 σ” 情形下使用 z 置信区间与 z 检验提供了理论依据。

Summary（小结）

The difference in sample means is an unbiased estimator of the difference in population means, and its standard error depends on both population variances and sample sizes.

📖 点击查看中文解释

样本均值差是总体均值差的无偏估计量，其标准误由两总体的方差和样本量共同决定。

Slide 4 — Confidence interval for $μ_{1} - μ_{2}$ when σ₁, σ₂ are known（第4页——已知 σ₁、σ₂ 时的均值差置信区间）

Knowledge Points (知识点)

Point estimate $\bar{X}_{1} - \bar{X}_{2}$ （总体均值差的点估计）
$100(1 - α)\%$ confidence interval formula with known σ₁, σ₂（已知总体标准差时的置信区间公式）
Meaning of significance level $α$ in a two-tailed interval（双侧置信区间中的显著性水平）

Explanation（解释）

The point estimate of the difference in population means is

$$\hat{μ}_{1} - \hat{μ}_{2} = \bar{X}_{1} - \bar{X}_{2}.$$

When $σ_{1}, σ_{2}$ are known and the sampling distribution of $\bar{X}_{1} - \bar{X}_{2}$ is normal, a $100(1 - α)\%$ confidence interval for $μ_{1} - μ_{2}$ is

$$(\bar{X}_{1} - \bar{X}_{2}) \pm z_{α/2} \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.$$

Here $z_{α /2}$ is the critical value from the standard normal distribution such that the two tails together have area $α$ .

📖 点击查看中文解释

总体均值差的点估计为

$$\hat{μ}_{1} - \hat{μ}_{2} = \bar{X}_{1} - \bar{X}_{2}.$$

当 $σ_{1}, σ_{2}$ 已知且 $\bar{X}_{1} - \bar{X}_{2}$ 近似正态时， $100(1 - α)\%$ 的置信区间为

$$(\bar{X}_{1} - \bar{X}_{2}) \pm z_{α/2} \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.$$

其中 $z_{α /2}$ 是标准正态分布的临界值，使得两侧尾部的总面积为 $α$ 。

Example（例子）

Suppose $\bar{X}_{1} = 120, \bar{X}_{2} = 110, σ_{1} = 15, σ_{2} = 20, n_{1} = 50, n_{2} = 60$ .
For a 95% confidence interval, $α = 0.05$ and $z_{α/2} = z_{0.025} \approx 1.96$ .
The standard error is

$$\sqrt{\frac{15^{2}}{50} + \frac{20^{2}}{60}} = \sqrt{\frac{225}{50} + \frac{400}{60}} \approx \sqrt{4.5 + 6.67} = \sqrt{11.17} \approx 3.34.$$

The interval is

$$(120 - 110) \pm 1.96 \times 3.34 \approx 10 \pm 6.55 = (3.45, 16.55).$$

We are 95% confident that $μ_{1} - μ_{2}$ lies between 3.45 and 16.55.
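
The interval can be reproduced with Python's standard library. This is a sketch under the same assumptions as the example (known $σ_{1}, σ_{2}$ and a normal sampling distribution); the function name `z_interval` is hypothetical. Because the code carries full precision instead of rounding the critical value and standard error first, its endpoints can differ from a hand calculation in the last digit.

```python
import math
from statistics import NormalDist

def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Two-sided z confidence interval for mu1 - mu2 with known sigmas."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2                           # point estimate
    margin = z_crit * se
    return diff - margin, diff + margin

low, high = z_interval(120, 110, 15, 20, 50, 60)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```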

📖 点击查看中文解释

设 $\bar{X}_{1} = 120, \bar{X}_{2} = 110, σ_{1} = 15, σ_{2} = 20, n_{1} = 50, n_{2} = 60$ 。

对于 95% 置信区间， $α = 0.05$ ， $z_{α /2} = z_{0.025} \approx 1.96$ 。

标准误为

$$\sqrt{\frac{15^{2}}{50} + \frac{20^{2}}{60}} \approx 3.34.$$

置信区间为

$$10 \pm 1.96 \times 3.34 \approx 10 \pm 6.55 = (3.45, 16.55).$$

我们有 95% 的把握认为，总体均值差 $μ_{1} - μ_{2}$ 介于 3.45 与 16.55 之间。

Extension（拓展）

The same formula can be adapted for one-sided intervals by using $z_{α}$ instead of $z_{α /2}$ .
The structure parallels the one-sample z-interval, but now the standard error combines the variability from both populations.

📖 点击查看中文解释

若构造单侧置信区间，只需将临界值改为 $z_{α}$ 而非 $z_{α /2}$ 。

该公式的结构与单总体 z 置信区间类似，只是标准误中合并了两个总体的变异性。

Summary（小结）

With known population standard deviations, the confidence interval for $μ_{1} - μ_{2}$ is built from the point estimate $\bar{X}_{1} - \bar{X}_{2}$ plus/minus a z-critical value times the combined standard error.

📖 点击查看中文解释

当总体标准差已知时，均值差的置信区间由“样本均值差 ± z 临界值 × 合并标准误”构成，是两总体均值比较的基本工具。

Slide 5 — Confidence interval for μ₁ − μ₂（第5页——均值差置信区间）

Knowledge Points (知识点)

Point estimate $\bar{x}_{1} - \bar{x}_{2}$ for $μ_{1} - μ_{2}$ （总体均值差的点估计）
Confidence interval formula with known $σ_{1}, σ_{2}$ （已知总体标准差时的置信区间公式）
Significance level $α$ and two-tailed intervals（显著性水平与双侧区间）

Explanation（解释）

The point estimate of the difference in population means is

$$\hat{μ}_{1} - \hat{μ}_{2} = \bar{x}_{1} - \bar{x}_{2}.$$

When population standard deviations $σ_{1}, σ_{2}$ are known and samples are independent, a $100(1 - α)\%$ confidence interval for $μ_{1} - μ_{2}$ is

$$(\bar{x}_{1} - \bar{x}_{2}) \pm z_{α/2} \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.$$

Here $z_{α /2}$ is the critical value from the standard normal distribution such that each tail has probability $α /2$ .
$α$ is the significance level for a two-tailed interval: it is the total probability outside the interval.

📖 点击查看中文解释

总体均值差 $μ_{1} - μ_{2}$ 的点估计为

$$\hat{μ}_{1} - \hat{μ}_{2} = \bar{x}_{1} - \bar{x}_{2}.$$

当总体标准差 $σ_{1}, σ_{2}$ 已知且样本相互独立时， $100(1 - α)\%$ 的置信区间为

$$(\bar{x}_{1} - \bar{x}_{2}) \pm z_{α/2} \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}.$$

其中 $z_{α /2}$ 是标准正态分布的临界值，使得每个尾部的概率为 $α /2$ 。

$α$ 是双侧置信区间的显著性水平，表示区间之外的总概率。

Example（例子）

For a 95% confidence interval, we set $α = 0.05$ .
The corresponding critical value is $z_{α /2} = z_{0.025} \approx 1.96$ .
Any confidence interval using 95% confidence will have the form

$$(\bar{x}_{1} - \bar{x}_{2}) \pm 1.96 \times (\text{standard error}).$$
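
Critical values for other confidence levels come from the inverse of the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # Each tail holds alpha/2, so look up the (1 - alpha/2) quantile
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

Higher confidence pushes the critical value outward, so the interval widens for the same data.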

📖 点击查看中文解释

若需要 95% 置信区间，则取 $α = 0.05$ 。

此时临界值为 $z_{α /2} = z_{0.025} \approx 1.96$ 。

任意 95% 置信区间都可以写成

$$(\bar{x}_{1} - \bar{x}_{2}) \pm 1.96 \times \text{标准误}.$$

Extension（拓展）

If the hypothesized difference (often $0$ ) lies outside the confidence interval, we will later reject the null hypothesis in a two-tailed test.
Thus confidence intervals and hypothesis tests are closely connected.

📖 点击查看中文解释

若某个假设的均值差（通常为 $0$ ）落在置信区间之外，那么在后面进行的双侧假设检验中，我们会拒绝原假设。

因此，置信区间与假设检验之间有紧密联系。

Summary（小结）

With known $σ_{1}, σ_{2}$ , the confidence interval for $μ_{1} - μ_{2}$ is constructed by taking the point estimate $\bar{x}_{1} - \bar{x}_{2}$ plus/minus a z-critical value times the combined standard error.

📖 点击查看中文解释

当总体标准差已知时，均值差的置信区间由“样本均值差 ± z 临界值 × 合并标准误”构成，是比较两个总体均值的基本工具。

Slide 6 — Example setup: ABC company vs competitor（第6页——示例设定：ABC 公司与竞争对手）

Knowledge Points (知识点)

Comparing two population means using sample information（用样本信息比较两个总体均值）
Identifying $n_{1}, n_{2}, \bar{x}_{1}, \bar{x}_{2}, σ_{1}, σ_{2}$ from a table（从表格中识别参数）
Significance level $α = 0.05$ in practice（实际问题中的显著性水平）

Explanation（解释）

The ABC company compares the average life of its own product (Sample 1) with that of a competitor (Sample 2).
Both samples are independent, and population standard deviations are assumed known.

📖 点击查看中文解释

ABC 公司想比较自己产品（样本 1）与竞争对手产品（样本 2）的平均寿命。

两个样本相互独立，并且假设总体标准差已知。

Example（例子）

Data table（数据表）

| | Sample 1 (ABC) | Sample 2 (Competitor) |
| --- | --- | --- |
| Sample size $n$ | 120 units | 80 units |
| Sample mean $\bar{x}$ | 275 min | 258 min |
| Population standard deviation $σ$ | 15 min | 20 min |

Significance level: $α = 0.05$ .

📖 点击查看中文解释

样本 1（ABC 产品）：

样本量 $n_{1} = 120$ ，样本均值 $\bar{x}_{1} = 275$ 分钟，标准差 $σ_{1} = 15$ 分钟；

样本 2（竞争对手）：

样本量 $n_{2} = 80$ ，样本均值 $\bar{x}_{2} = 258$ 分钟，标准差 $σ_{2} = 20$ 分钟；

显著性水平： $α = 0.05$ 。

Extension（拓展）

The question “Is there any difference?” corresponds to testing whether

$$μ_{1} - μ_{2} = 0$$

or estimating a confidence interval to see if $0$ is included.

📖 点击查看中文解释

“是否有差异？”这一问题在统计上对应于检验

$$μ_{1} - μ_{2} = 0$$

或者构造置信区间，看 $0$ 是否落在区间之内。

Summary（小结）

This example provides realistic sample data and a chosen significance level, allowing us to compute a confidence interval for $μ_{1} - μ_{2}$ and judge whether ABC’s product differs from its competitor’s.

📖 点击查看中文解释

本例给出了具体的样本数据和显著性水平，为我们计算 $μ_{1} - μ_{2}$ 的置信区间、判断 ABC 产品是否优于竞争对手提供了基础。

Slide 7 — Example calculation: confidence interval and conclusion（第7页——示例计算：置信区间与结论）

Knowledge Points (知识点)

Computing the standard error of $\bar{x}_{1} - \bar{x}_{2}$ （计算样本均值差的标准误）
Finding the confidence interval numerically（数值上求出置信区间）
Using the interval to judge significance（利用置信区间判断显著性）

Explanation（解释）

From the ABC example, the point estimate is

$$\bar{x}_{1} - \bar{x}_{2} = 275 - 258 = 17 \text{ min}.$$

The standard error of $\bar{x}_{1} - \bar{x}_{2}$ is

$$SE = \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}} = \sqrt{\frac{15^{2}}{120} + \frac{20^{2}}{80}} = \sqrt{6.875} \approx 2.62.$$

With $α = 0.05$ , the critical value is $z_{0.025} \approx 1.96$ .
The 95% confidence interval is

$$17 \pm 1.96 \times 2.62 \approx 17 \pm 5.14 = (11.86, 22.14).$$

📖 点击查看中文解释

对于 ABC 例子，均值差的点估计为

$$\bar{x}_{1} - \bar{x}_{2} = 275 - 258 = 17 \text{ 分钟}.$$

样本均值差的标准误为

$$SE = \sqrt{\frac{15^{2}}{120} + \frac{20^{2}}{80}} \approx 2.62.$$

显著性水平 $α = 0.05$ 时，临界值 $z_{0.025} = 1.96$ 。

置信区间为

$$17 \pm 1.96 \times 2.62 = 17 \pm 5.14 = (11.86, 22.14).$$

Example（例子）：Interpretation（区间解释）

Because the entire interval $(11.86, 22.14)$ is above 0, we can say:
- At the 5% significance level, there is significant evidence that ABC’s product has a larger mean life than its competitor’s.
- The estimated difference in mean life is between about 12 and 22 minutes.
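
The ABC calculation and the "is 0 inside the interval?" check can be sketched in Python as follows. The variable names are illustrative; the data are the values from the ABC table.

```python
import math
from statistics import NormalDist

# ABC example data (population standard deviations assumed known)
n1, xbar1, sigma1 = 120, 275.0, 15.0   # ABC
n2, xbar2, sigma2 = 80, 258.0, 20.0    # competitor
alpha = 0.05

diff = xbar1 - xbar2
se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
margin = NormalDist().inv_cdf(1 - alpha / 2) * se
low, high = diff - margin, diff + margin

# If 0 is outside the interval, a two-tailed test at level alpha rejects H0
contains_zero = low <= 0 <= high
print(f"SE = {se:.2f}")
print(f"95% CI = ({low:.2f}, {high:.2f}), contains 0: {contains_zero}")
```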

📖 点击查看中文解释

因为区间 $(11.86, 22.14)$ 整体大于 0，所以在 5% 的显著性水平下：

有显著证据表明 ABC 产品的平均寿命高于竞争对手；

平均寿命的差异估计在约 12 到 22 分钟之间。

Extension（拓展）

If 0 had been contained in the interval, we would conclude that the data are consistent with no difference in mean lifetimes.
This logic is equivalent to performing a two-tailed hypothesis test with null hypothesis $μ_{1} - μ_{2} = 0$ .

📖 点击查看中文解释

如果 0 落在置信区间内，我们会认为数据与“平均寿命无差异”的假设一致。

这种判断与对原假设 $μ_{1} - μ_{2} = 0$ 进行双侧假设检验的结论是等价的。

Summary（小结）

For the ABC example, the 95% confidence interval shows a positive difference far from 0, indicating that ABC’s product performs significantly better than the competitor’s in terms of mean life.

📖 点击查看中文解释

在 ABC 示例中，95% 置信区间完全为正且远离 0，说明 ABC 产品在平均寿命上显著优于竞争对手。

Slide 8 — Hypothesis tests about μ₁ − μ₂ with known σ₁, σ₂（第8页——已知 σ₁、σ₂ 时的均值差假设检验）

Knowledge Points (知识点)

Null and alternative hypotheses for comparing two means（比较两个均值的原假设与备择假设）
Three types of tests: left-tailed, right-tailed, two-tailed（三种检验形式：左尾、右尾、双尾）
z test statistic for $μ_{1} - μ_{2}$ when σ’s are known（已知总体标准差时的 z 检验统计量）

Explanation（解释）

Hypotheses（假设形式）

Left-tailed test (testing if population 1 mean is smaller):

$$H_{0}: μ_{1} - μ_{2} \geq D_{0}, \quad H_{a}: μ_{1} - μ_{2} < D_{0}.$$

Right-tailed test (testing if population 1 mean is larger):

$$H_{0}: μ_{1} - μ_{2} \leq D_{0}, \quad H_{a}: μ_{1} - μ_{2} > D_{0}.$$

Two-tailed test (testing if there is any difference):

$$H_{0}: μ_{1} - μ_{2} = D_{0}, \quad H_{a}: μ_{1} - μ_{2} \neq D_{0}.$$

Test statistic (known $σ_{1}, σ_{2}$ ):

$$z = \frac{(\bar{x}_{1} - \bar{x}_{2}) - D_{0}}{\sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}}.$$

📖 点击查看中文解释

左尾检验（检验总体 1 均值是否更小）：

$$H_{0}: μ_{1} - μ_{2} \geq D_{0}, \quad H_{a}: μ_{1} - μ_{2} < D_{0}.$$

右尾检验（检验总体 1 均值是否更大）：

$$H_{0}: μ_{1} - μ_{2} \leq D_{0}, \quad H_{a}: μ_{1} - μ_{2} > D_{0}.$$

双尾检验（检验是否存在差异）：

$$H_{0}: μ_{1} - μ_{2} = D_{0}, \quad H_{a}: μ_{1} - μ_{2} \neq D_{0}.$$

当 $σ_{1}, σ_{2}$ 已知时，检验统计量为

$$z = \frac{(\bar{x}_{1} - \bar{x}_{2}) - D_{0}}{\sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}}.$$

Example（例子）

For the ABC company, to test whether there is any difference, we set $D_{0} = 0$ and use the two-tailed hypotheses:

$$H_{0}: μ_{1} - μ_{2} = 0, \quad H_{a}: μ_{1} - μ_{2} \neq 0.$$

The test statistic becomes

$$z = \frac{(\bar{x}_{1} - \bar{x}_{2}) - 0}{SE} = \frac{17}{2.62} \approx 6.49,$$

which is far beyond the two-tailed critical value $z_{0.025} \approx 1.96$ for $α = 0.05$ , so we reject $H_{0}$ .

📖 点击查看中文解释

对 ABC 公司例子，若要检验“是否存在差异”，取 $D_{0} = 0$ ，建立双尾假设：

$$H_{0}: μ_{1} - μ_{2} = 0, \quad H_{a}: μ_{1} - μ_{2} \neq 0.$$

检验统计量为

$$z = \frac{17}{2.62} \approx 6.49,$$

远大于 $α = 0.05$ 时的临界值 $1.96$ ，因此拒绝原假设。

Extension（拓展）

Decision rules:
- Left-tailed: reject $H_{0}$ if $z < - z_{α}$ .
- Right-tailed: reject $H_{0}$ if $z > z_{α}$ .
- Two-tailed: reject $H_{0}$ if $|z| > z_{α/2}$ .
p-value methods lead to equivalent conclusions.
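
The three decision rules can be collected into one small function. This is a sketch, not a library API: the name `reject_h0` and the `tail` labels `"left"`, `"right"`, `"two"` are assumptions made for illustration.

```python
from statistics import NormalDist

def reject_h0(z, alpha, tail):
    """Apply the critical-value decision rule for a z test.

    tail is one of "left", "right", "two" (labels chosen for this sketch).
    """
    std_normal = NormalDist()
    if tail == "left":
        return z < -std_normal.inv_cdf(1 - alpha)        # z < -z_alpha
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)         # z > z_alpha
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    raise ValueError("tail must be 'left', 'right', or 'two'")

# ABC example: z is about 6.49
print(reject_h0(6.49, 0.05, "two"))    # True
print(reject_h0(6.49, 0.05, "left"))   # False

# A borderline value: significant one-sided but not two-sided at alpha = 0.05
print(reject_h0(1.70, 0.05, "right"))  # True
print(reject_h0(1.70, 0.05, "two"))    # False
```

The last two calls show why the tail form must be chosen before looking at the data: the same $z$ can lead to different decisions under one-sided and two-sided rules.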

📖 点击查看中文解释

决策规则：

左尾检验：若 $z < - z_{α}$ ，则拒绝 $H_{0}$ ；

右尾检验：若 $z > z_{α}$ ，则拒绝 $H_{0}$ ；

双尾检验：若 $|z| > z_{α/2}$ ，则拒绝 $H_{0}$ 。

使用 p 值方法也会得到等价的结论。

Summary（小结）

Hypothesis tests about $μ_{1} - μ_{2}$ specify a null value $D_{0}$ , choose the appropriate tail form, and use the z statistic based on the combined standard error when $σ_{1}, σ_{2}$ are known.

📖 点击查看中文解释

针对 $μ_{1} - μ_{2}$ 的假设检验，需要给定假设差值 $D_{0}$ ，确定是左尾、右尾还是双尾检验，并在已知标准差时使用基于合并标准误的 z 统计量进行决策。

Slide 9 — One-sided test: ABC product vs competitor (p-value)（第9页——单侧检验：ABC 产品与竞争对手（p 值法））

Knowledge Points (知识点)

Right-tailed test for difference of means（均值差的右尾检验）
p-value vs. significance level $α$ （p 值与显著性水平的比较）
Interpretation: “significantly higher” vs “not higher”（“显著更高”的解释）

Explanation（解释）

Question: Is ABC’s mean product life higher than the competitor’s?
We use a right-tailed test for the difference of two population means.

Step 1: Hypotheses

$$H_{0}: μ_{1} - μ_{2} \leq 0 \quad \text{vs.} \quad H_{a}: μ_{1} - μ_{2} > 0$$

$μ_{1}$ : mean life of ABC’s product
$μ_{2}$ : mean life of competitor’s product

Step 2: Significance level

$$α = 0.01$$

Step 3: Test statistic and p-value

$$z = \frac{(\bar{x}_{1} - \bar{x}_{2}) - 0}{\sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}}} = \frac{(275 - 258) - 0}{\sqrt{\frac{15^{2}}{120} + \frac{20^{2}}{80}}} \approx 6.49$$

For $z \approx 6.49$ , the p-value is less than 0.001 (very close to 0).

Decision rule (right-tailed):
If the p-value is less than $α$ , reject $H_{0}$ .

Since the p-value $< 0.001 < 0.01 = α$ , we reject $H_{0}$ in favor of $H_{a}$ .
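
The right-tail p-value can be computed directly instead of being bounded from a table. The sketch below uses `math.erfc`, which stays accurate far out in the tail; the variable names are illustrative. The statistic is computed at full precision, so it may differ from the hand-rounded value in the second decimal place.

```python
import math

# ABC example data (population standard deviations assumed known)
xbar1, sigma1, n1 = 275.0, 15.0, 120
xbar2, sigma2, n2 = 258.0, 20.0, 80
d0 = 0.0        # hypothesized difference under H0
alpha = 0.01

se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = ((xbar1 - xbar2) - d0) / se

# Right-tail p-value: P(Z > z) = 0.5 * erfc(z / sqrt(2)) for standard normal Z
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}")
print(f"p-value = {p_value:.2e}")
print(f"reject H0 at alpha = {alpha}: {p_value < alpha}")
```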

📖 点击查看中文解释

问题：ABC 产品的平均寿命是否高于竞争对手？

使用右尾检验比较两个总体均值。
步骤 1：提出假设

$$H_{0}: μ_{1} - μ_{2} \leq 0, \quad H_{a}: μ_{1} - μ_{2} > 0$$

其中 $μ_{1}$ 为 ABC 产品的总体平均寿命， $μ_{2}$ 为竞争对手产品的总体平均寿命。

步骤 2：显著性水平

$$α = 0.01$$

步骤 3：计算检验统计量与 p 值

$$z = \frac{(275 - 258) - 0}{\sqrt{\frac{15^{2}}{120} + \frac{20^{2}}{80}}} \approx 6.49$$

对应的 p 值 < 0.001，远小于 0.01。

决策规则（右尾）：若 p 值 $< α$ ，则拒绝 $H_{0}$ 。

因为 p 值 $< 0.001 < 0.01$ ，所以拒绝 $H_{0}$ ，支持 $H_{a}$ 。

Example（例子）结论

We conclude that ABC’s mean life is significantly higher than the competitor’s at the 1% significance level.

📖 点击查看中文解释

结论：在 1% 的显著性水平下，ABC 产品的平均寿命显著高于竞争对手产品。

Extension（拓展）

The p-value approach does not require computing the critical value.
It directly tells how extreme the observed $z$ is under $H_{0}$ .

📖 点击查看中文解释

p 值方法不需要先求临界值，而是直接衡量在原假设成立时观测到当前 $z$ 值“有多极端”。

p 值越小，反对 $H_{0}$ 的证据越强。

Summary（小结）

For a one-sided right-tailed test, if the computed p-value is smaller than $α$ , we conclude that population 1’s mean is significantly larger than population 2’s mean.

📖 点击查看中文解释

在单侧右尾检验中，若 p 值小于显著性水平 $α$ ，则说明总体 1 的均值显著大于总体 2 的均值。

Slide 10 — One-sided test: ABC vs competitor (critical value)（第10页——单侧检验：ABC 与竞争对手（临界值法））

Knowledge Points (知识点)

z-critical value $z_{α}$ for right-tailed test（右尾检验的临界值 $z_{α}$ ）
Compare test statistic $z$ with $z_{α}$ （用 $z$ 与 $z_{α}$ 比较做决策）
Connection to p-value approach（与 p 值法的一致性）

Explanation（解释）

Same hypotheses and data as Slide 9:

$$H_{0}: μ_{1} - μ_{2} \leq 0, \quad H_{a}: μ_{1} - μ_{2} > 0, \quad α = 0.01$$

The test statistic is still $z \approx 6.49$ .

Critical value for a right-tailed test:

$$z_{α} = z_{0.01} \approx 2.33$$

Decision rule (right-tailed):
Reject $H_{0}$ if $z > z_{α}$ .

Since

$$z \approx 6.49 > 2.33 \approx z_{α},$$

we reject $H_{0}$ and conclude that ABC’s mean is significantly higher.

📖 点击查看中文解释

与第 9 页相同的假设与数据：

$$H_{0}: μ_{1} - μ_{2} \leq 0, \quad H_{a}: μ_{1} - μ_{2} > 0, \quad α = 0.01$$

检验统计量仍为 $z \approx 6.49$ 。

右尾检验的临界值

$$z_{α} = z_{0.01} \approx 2.33$$

决策规则：若 $z > z_{α}$ ，则拒绝 $H_{0}$ 。

因为

$$z \approx 6.49 > 2.33 \approx z_{α},$$

所以拒绝 $H_{0}$ ，认为 ABC 产品的平均寿命显著更高。

Example（例子）比较两种方法

p-value approach (Slide 9): compare p-value with $α$ .
Critical value approach (this slide): compare $z$ with $z_{α}$ .
Both give the same conclusion.
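
The agreement between the two approaches can be checked numerically for a right-tailed test. In this sketch the helper `right_tail_p` and the list of trial $z$ values are hypothetical; only $α = 0.01$ and $z \approx 6.49$ come from the slides.

```python
import math
from statistics import NormalDist

alpha = 0.01
z_alpha = NormalDist().inv_cdf(1 - alpha)   # right-tail critical value

def right_tail_p(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

print(f"z_alpha = {z_alpha:.2f}")
for z in (1.0, 2.0, 2.5, 6.49):
    by_p_value = right_tail_p(z) < alpha    # p-value approach
    by_critical = z > z_alpha               # critical-value approach
    print(f"z = {z:5.2f}  reject by p-value: {by_p_value}  reject by critical value: {by_critical}")
```

The two columns match for every $z$ because $P(Z > z) < α$ holds exactly when $z > z_{α}$ : the p-value is a decreasing function of $z$ .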

📖 点击查看中文解释

p 值法（第 9 页）：比较 p 值与 $α$ 。

临界值法（本页）：比较 $z$ 与 $z_{α}$ 。

两种方法得到的结论完全一致。

Extension（拓展）

For a left-tailed test, we would use $z_{1 - α} = - z_{α}$ .
For a two-tailed test, we use $\pm z_{α /2}$ as critical values.

📖 点击查看中文解释

左尾检验中，临界值为 $z_{1 - α} = - z_{α}$ 。

双尾检验中，临界值为 $\pm z_{α /2}$ 。

Summary（小结）

In the critical-value approach, if the standardized test statistic lies in the rejection region (beyond the critical value), we reject $H_{0}$ ; otherwise, we fail to reject $H_{0}$ .

📖 点击查看中文解释

在临界值法中，若标准化统计量落入拒绝域（超过临界值），就拒绝 $H_{0}$ ；否则就“不拒绝” $H_{0}$ 。

Slide 11 — Practice: TOEFL scores of two universities（第11页——练习：两所大学托福成绩比较）

Knowledge Points (知识点)

Two-sample z test for difference in mean scores（两总体均值差的 z 检验）
Setting up hypotheses for “significant difference”（“是否有显著差异”的假设设定）
Interpreting test results in context（在情境中解释统计结论）

Explanation（解释）

We compare TOEFL scores of Newland University and ABC University.

Data table（数据表）

| Group | Sample mean score ( $\bar{x}$ ) | Population standard deviation ( $σ$ ) | Sample size ( $n$ ) |
| --- | --- | --- | --- |
| Newland University | 103 | 15 | 50 |
| ABC University | 96 | 10 | 50 |

Significance level: $α = 0.05$ .

Step 1: Hypotheses

“Is there significant difference?” → two-tailed test

$$H_{0}: μ_{1} - μ_{2} = 0 \quad \text{vs.} \quad H_{a}: μ_{1} - μ_{2} \neq 0$$

where

$μ_{1}$ : mean TOEFL score of Newland students,
$μ_{2}$ : mean TOEFL score of ABC students.

Step 2: Test statistic

Point estimate:

$$\bar{x}_{1} - \bar{x}_{2} = 103 - 96 = 7$$

Standard error:

$$SE = \sqrt{\frac{σ_{1}^{2}}{n_{1}} + \frac{σ_{2}^{2}}{n_{2}}} = \sqrt{\frac{15^{2}}{50} + \frac{10^{2}}{50}} = \sqrt{\frac{225}{50} + \frac{100}{50}} = \sqrt{6.5} \approx 2.55$$

z-score:

$$z = \frac{(\bar{x}_{1} - \bar{x}_{2}) - 0}{SE} = \frac{7}{2.55} \approx 2.75$$

Step 3: Decision

For a two-tailed test with $α = 0.05$ , the critical values are

$$\pm z_{0.025} = \pm 1.96.$$

Since $|z| \approx 2.75 > 1.96$ , we reject $H_{0}$ .
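
The TOEFL test can be run end to end in Python. The sketch below also reports a two-tailed p-value, which the slide does not compute; that part is an addition for illustration, and the variable names are assumptions.

```python
import math
from statistics import NormalDist

# TOEFL data (population standard deviations assumed known)
xbar1, sigma1, n1 = 103.0, 15.0, 50   # Newland University
xbar2, sigma2, n2 = 96.0, 10.0, 50    # ABC University
d0 = 0.0
alpha = 0.05

se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = ((xbar1 - xbar2) - d0) / se
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Two-tailed p-value: 2 * P(Z > |z|) = erfc(|z| / sqrt(2))
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"SE = {se:.2f}, z = {z:.2f}, critical value = {z_crit:.2f}")
print(f"two-tailed p-value = {p_value:.4f}")
print(f"reject H0: {abs(z) > z_crit}")
```

The p-value falls below 0.05, so the p-value route reaches the same decision as comparing $|z|$ with 1.96.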

📖 点击查看中文解释

比较 Newland University 与 ABC University 学生的托福成绩。

数据见上表，显著性水平 $α = 0.05$ 。
步骤 1：假设（是否有差异 → 双尾检验）

$$H_{0}: μ_{1} - μ_{2} = 0, \quad H_{a}: μ_{1} - μ_{2} \neq 0$$

$μ_{1}$ ：Newland 学生托福平均分； $μ_{2}$ ：ABC 学生托福平均分。

步骤 2：检验统计量

$$\bar{x}_{1} - \bar{x}_{2} = 103 - 96 = 7$$

$$SE = \sqrt{\frac{15^{2}}{50} + \frac{10^{2}}{50}} = \sqrt{6.5} \approx 2.55$$

$$z = \frac{7}{2.55} \approx 2.75$$

步骤 3：决策

双尾检验、 $α = 0.05$ 时临界值为 $\pm 1.96$ 。

因为 $|z| \approx 2.75 > 1.96$ ，所以拒绝 $H_{0}$ 。

Example（例子）结论

There is a significant difference in mean TOEFL scores between Newland and ABC at the 5% level.
Since $\bar{x}_{1} = 103 > 96 = \bar{x}_{2}$ , Newland students have higher average TOEFL scores.

📖 点击查看中文解释

在 5% 的显著性水平下，两所大学的托福平均分存在显著差异。

且 Newland 的样本均值更高，说明 Newland 学生的托福成绩平均更高。

Extension（拓展）

We could also construct a 95% confidence interval for $μ_{1} - μ_{2}$ :

$$(\bar{x}_{1} - \bar{x}_{2}) \pm z_{0.025} \times SE = 7 \pm 1.96 \times 2.55 \approx 7 \pm 5.00 = (2.0, 12.0)$$

Because 0 is not in this interval, the confidence-interval approach gives the same conclusion as the hypothesis test.

📖 点击查看中文解释

亦可构造 95% 置信区间：

$$7 \pm 1.96 \times 2.55 \approx (2.0, 12.0)$$

由于区间不包含 0，与假设检验得到的“有显著差异”结论一致。

Summary（小结）

For the TOEFL example, we used a two-sample z test (and an equivalent confidence interval) to show that Newland University’s mean TOEFL score is significantly higher than ABC University’s at $α = 0.05$ .

📖 点击查看中文解释

托福示例通过两样本 z 检验（以及等价的置信区间）表明，在显著性水平 $α = 0.05$ 下，Newland University 学生的平均托福分数显著高于 ABC University。
