Slide 1 — Overview of Hypothesis Testing(第1页——假设检验概述)

Knowledge Points (知识点)

  1. Purpose of hypothesis testing(假设检验的目的)
  2. Structure of hypotheses: null vs. alternative(假设结构:原假设与备择假设)
  3. Errors and tests for population mean with known σ(已知总体标准差时的均值检验与错误类型)

🔹Knowledge Point 1 — Purpose of hypothesis testing(假设检验的目的)

Explanation(解释)
Hypothesis testing is a procedure to decide whether a statement about a population parameter should be rejected based on sample data.
假设检验是一种统计程序,用样本数据来判断关于总体参数的某个陈述是否应该被拒绝。

Example(例子)
A company claims that the average waiting time is 10 minutes; we take a sample and test whether the true mean is still 10 minutes.
一家公司声称平均等待时间是 10 分钟,我们抽取样本并检验真实平均等待时间是否仍为 10 分钟。

Extension(拓展)
In business statistics, hypothesis tests are used for quality control, marketing surveys, and financial performance evaluation.
在商业统计中,假设检验广泛用于质量控制、市场调查以及财务绩效评估。


🔹Knowledge Point 2 — Structure of hypotheses: null vs. alternative(假设结构:原假设与备择假设)

Explanation(解释)
A hypothesis test always compares a null hypothesis $H_0$ (status quo) with an alternative hypothesis $H_a$ (research claim).
任何假设检验都比较原假设 $H_0$(维持现状)与备择假设 $H_a$(研究主张)。

Example(例子)
$H_0$: the average exam score is 75.
$H_a$: the average exam score is greater than 75.
$H_0$:考试平均分为 75;
$H_a$:考试平均分大于 75。

Extension(拓展)
The form of $H_a$ (>, <, ≠) determines whether the test is one-tailed or two-tailed and affects the rejection region.
备择假设是 “大于、小于还是不等于” 决定了检验是单尾还是双尾,并影响拒绝域的位置。


🔹Knowledge Point 3 — Errors and z-test for population mean(错误类型与总体均值 z 检验)

Explanation(解释)
When the population standard deviation $\sigma$ is known and either the population is normal or the sample size $n$ is large, we use the z-test for the mean and control Type I and Type II errors.
当总体标准差 $\sigma$ 已知且总体正态或样本量 $n$ 较大时,我们使用均值的 z 检验,并关注第一类与第二类错误。

Example(例子)
We test $H_0: \mu = \mu_0$ using a sample mean $\bar{x}$ and known $\sigma$ with a z-test at significance level $\alpha$.
我们在显著性水平 $\alpha$ 下,用样本均值 $\bar{x}$ 和已知的 $\sigma$ 进行 z 检验。

Extension(拓展)
Type I error means rejecting $H_0$ when it is actually true; Type II error means failing to reject $H_0$ when $H_a$ is true.
第一类错误指在 $H_0$ 真实时错误地拒绝它;第二类错误指在 $H_a$ 真实时没有拒绝 $H_0$。


Summary(总结)
Hypothesis testing uses sample data to compare $H_0$ and $H_a$, control error risks, and decide whether a population statement should be rejected.
假设检验利用样本数据在 $H_0$ 与 $H_a$ 之间作出判断,在控制错误风险的前提下决定是否拒绝某个总体陈述。


Slide 2 — Null and Alternative Hypotheses(第2页——原假设与备择假设)

Knowledge Points (知识点)

  1. Why we need hypothesis testing(为什么需要假设检验)
  2. Definition of the null hypothesis H₀(原假设 H₀ 的定义)
  3. Definition of the alternative hypothesis Hₐ and use of sample data(备择假设 Hₐ 与样本数据)

🔹Knowledge Point 1 — Why we need hypothesis testing(为什么需要假设检验)

Explanation(解释)
We cannot observe the whole population, so we use samples to decide whether a claim about a population parameter should be rejected.
我们无法观察整个总体,只能用样本数据来判断关于总体参数的说法是否应该被拒绝。

Example(例子)
A bank wants to know whether the average waiting time has increased compared with last year; a hypothesis test answers this question.
一家银行想知道平均等待时间是否相比去年增加,假设检验可以帮助回答这个问题。

Extension(拓展)
Hypothesis testing turns vague questions like “Has something changed?” into a structured decision rule based on probability.
假设检验把“是否发生了变化”这类模糊问题转化为基于概率的结构化决策规则。


🔹Knowledge Point 2 — Null hypothesis H₀(原假设 H₀)

Explanation(解释)
The null hypothesis $H_0$ is a tentative assumption about a population parameter, often representing “no change” or “no effect.”
原假设 $H_0$ 是对总体参数的一种暂时性假设,通常表示“没有变化”或“没有效应”。

Example(例子)
$H_0: \mu = 50$ means we temporarily assume the population mean is 50 until the data provide strong evidence against it.
$H_0: \mu = 50$ 表示在数据给出强烈反对证据之前,我们暂时假定总体均值为 50。

Extension(拓展)
Because $H_0$ represents the status quo, the burden of proof is on the data to show that $H_0$ should be rejected.
由于 $H_0$ 代表现状,因此需要由数据提供足够证据,才可以拒绝 $H_0$。


🔹Knowledge Point 3 — Alternative hypothesis Hₐ and sample data(备择假设 Hₐ 与样本数据)

Explanation(解释)
The alternative hypothesis $H_a$ is the opposite of $H_0$ and usually expresses the researcher’s belief or claim.
备择假设 $H_a$ 与 $H_0$ 相反,通常代表研究者的主张或想要证明的结论。

Example(例子)
$H_0: \mu = 50$ vs. $H_a: \mu \neq 50$ means we want to know whether the mean has changed from 50 in either direction.
$H_0: \mu = 50$ 对比 $H_a: \mu \neq 50$ 表示我们关心均值是否从 50 发生了任何方向的变化。

Extension(拓展)
We collect sample data and compute a test statistic; if the evidence strongly favors $H_a$, we reject $H_0$.
我们收集样本数据并计算检验统计量,如果证据强烈支持 $H_a$,就拒绝 $H_0$。


Summary(总结)
In hypothesis testing, $H_0$ sets the baseline and $H_a$ represents the research claim; sample data decide whether the baseline should be rejected.
在假设检验中,$H_0$ 提供基准,$H_a$ 代表研究主张,而样本数据决定是否要推翻这一基准。


Slide 3 — Supporting the Alternative Hypothesis(第3页——支持备择假设)

Knowledge Points (知识点)

  1. Using data to support a research hypothesis(利用数据支持研究假设)
  2. Interpreting significant differences between sample and expected value(解释样本与期望值的显著差异)
  3. Teaching-method example with mean scores(教学方法均值比较示例)

🔹Knowledge Point 1 — Using data to support a research hypothesis(利用数据支持研究假设)

Explanation(解释)
We gather data in support of a research hypothesis expressed as the alternative hypothesis $H_a$.
我们收集数据来支持以备择假设 $H_a$ 形式提出的研究假设。

Example(例子)
A researcher believes a new marketing campaign increases average sales; $H_a$ states that the mean sales after the campaign are higher.
研究人员认为新的营销活动会提高平均销售额,$H_a$ 就表述为活动后的平均销售额更高。

Extension(拓展)
Well-designed experiments and surveys help ensure that observed differences are due to the factor in $H_a$, not to random noise.
精心设计的实验与问卷能帮助保证观察到的差异主要来自 $H_a$ 中的因素,而不是随机噪声。


🔹Knowledge Point 2 — Significant difference and rejecting H₀(显著差异与拒绝 H₀)

Explanation(解释)
If the sample result is far from what $H_0$ predicts, the test statistic falls in the rejection region and we conclude that $H_a$ is supported.
如果样本结果与 $H_0$ 预测值相差很大,检验统计量落入拒绝域,我们就认为 $H_a$ 得到支持。

Example(例子)
$H_0: \mu = 70$, but the sample mean $\bar{x}$ is far from 70 and the p-value is very small; we reject $H_0$ and say the mean is significantly different from 70.
在 $H_0: \mu = 70$ 下,样本均值 $\bar{x}$ 明显偏离 70 且 p 值很小,我们拒绝 $H_0$,认为均值与 70 显著不同。

Extension(拓展)
“Significant” here means “unlikely under $H_0$,” not “large or important in a practical sense,” so statistical and practical significance must both be considered.
这里的“显著”指“在 $H_0$ 为真时几乎不可能出现”,而不是“在业务上一定很重要”,因此要同时考虑统计显著性和实际意义。


🔹Knowledge Point 3 — Example: new teaching method(示例:新教学方法)

Explanation(解释)
Suppose we compare two classes: class A (old method) with mean $\mu_A$ and class B (new method) with mean $\mu_B$.
假设我们比较两个班级:班级 A 使用旧方法,均值为 $\mu_A$;班级 B 使用新方法,均值为 $\mu_B$。

Example(例子)
Research claim: class B performs better than class A.
$H_0: \mu_B \le \mu_A$;
$H_a: \mu_B > \mu_A$.
研究主张:B 班成绩好于 A 班。

Extension(拓展)
If test results significantly support $H_a$, the school may adopt the new method more widely; otherwise, they keep or revise the current method.
如果检验结果显著支持 $H_a$,学校可能推广新教学方法;否则可能保持或调整原有方法。


Summary(总结)
When sample evidence is very unlikely under $H_0$, we reject $H_0$ and interpret the result as support for the research hypothesis $H_a$.
当样本证据在 $H_0$ 为真时几乎不可能出现时,我们拒绝 $H_0$,并将结果视为支持研究假设 $H_a$。


Slide 4 — Failing to Reject the Null Hypothesis(第4页——不拒绝原假设)

Knowledge Points (知识点)

  1. Starting with a null statement about a population parameter(以总体参数的原假设为起点)
  2. Using statistical evidence to reject or not reject H₀(用统计证据作出拒绝与否的决定)
  3. Example: comparing two class means(示例:比较两个班级的均值)

🔹Knowledge Point 1 — Starting with H₀(以 H₀ 为起点)

Explanation(解释)
We begin by stating that a specific value or relationship for a population parameter will hold under $H_0$.
我们首先假设总体参数的某个具体取值或关系在 $H_0$ 下成立。

Example(例子)
$H_0: \mu = 80$ assumes the true average score is 80 before looking at the data.
$H_0: \mu = 80$ 表示在观察数据之前,我们假定真实平均分为 80。

Extension(拓展)
Starting from $H_0$ avoids bias; we only change our belief when the data give strong enough evidence.
从 $H_0$ 出发可以避免偏见,只有当数据提供足够强的证据时,我们才改变原有判断。


🔹Knowledge Point 2 — Statistical evidence and decision(统计证据与决策)

Explanation(解释)
After computing the test statistic and p-value, we decide either to reject $H_0$ or fail to reject $H_0$ at a chosen significance level $\alpha$.
计算检验统计量和 p 值后,我们在选定的显著性水平 $\alpha$ 下决定是拒绝 $H_0$ 还是“不拒绝 $H_0$”。

Example(例子)
If the p-value is greater than $\alpha$, we do not reject $H_0$ because the sample result is plausible under $H_0$.
如果 p 值大于 $\alpha$,我们就不拒绝 $H_0$,因为在 $H_0$ 真实时,这样的样本结果是可以接受的。

Extension(拓展)
“Failing to reject $H_0$” does not prove $H_0$ is true; it only means there is not enough evidence against it.
“不拒绝 $H_0$” 并不等于证明 $H_0$ 为真,只是说明反对它的证据还不够强。


🔹Knowledge Point 3 — Example: no difference vs. difference(示例:没有差异与有差异)

Explanation(解释)
We compare two classes’ mean scores $\mu_A$ and $\mu_B$ and test whether there is any difference between them.
我们比较两个班级的平均分 $\mu_A$ 与 $\mu_B$,检验它们之间是否存在差异。

Example(例子)
$H_0: \mu_A = \mu_B$ (no difference).
$H_a: \mu_A \neq \mu_B$ (the two means differ).
If the test statistic is not extreme, we fail to reject $H_0$ and conclude there is no statistically significant difference.
$H_0: \mu_A = \mu_B$(没有差异);
$H_a: \mu_A \neq \mu_B$(两个均值不同)。
如果检验统计量不极端,我们就不拒绝 $H_0$,认为没有统计上显著的差异。

Extension(拓展)
Even when we fail to find a significant difference, practical considerations (cost, time, convenience) may still guide which option is preferred.
即使没有发现显著差异,现实中的成本、时间与便利性等因素仍然会影响我们选择哪种方案。


Summary(总结)
Failing to reject $H_0$ means the data are consistent with the null hypothesis, so we do not have enough evidence to claim the alternative $H_a$.
不拒绝 $H_0$ 表明数据与原假设相符,我们没有足够证据去支持备择假设 $H_a$。


Slide 5 — One- vs. Two-Tailed Tests(第5页——单尾检验与双尾检验)

Knowledge Points (知识点)

  1. One-tailed test based on null hypothesis(以原假设为起点的单尾检验)
  2. One- vs. two-tailed tests based on alternative hypothesis(以备择假设为起点的单尾与双尾检验)

🔹Knowledge Point 1 — One-tailed test based on H₀(基于 H₀ 的单尾检验)

Explanation(解释)
A one-tailed test focuses on a difference in one direction only, such as “less than or equal to” vs. “greater than.”
单尾检验只关注单一方向上的差异,比如“少于或等于”与“多于”的比较。

Example(例子)
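Let bottle A and bottle B have mean sizes $\mu_A$ and $\mu_B$.

  • One-tailed (A no larger than B): $H_0: \mu_A \le \mu_B$, $H_a: \mu_A > \mu_B$.

    If data strongly suggest $\mu_A > \mu_B$, we reject $H_0$.

设瓶 A 与瓶 B 的平均容量分别为 $\mu_A$ 和 $\mu_B$。

  • 单尾检验(A 不大于 B):$H_0: \mu_A \le \mu_B$,$H_a: \mu_A > \mu_B$。

    若数据强烈表明 $\mu_A > \mu_B$,则拒绝 $H_0$。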

Extension(拓展)
One-tailed tests are used when only one direction of difference is meaningful for the decision (e.g., “new product is better,” not “just different”).
当决策只关心某一方向上的差异时(例如只关心“新产品是否更好”而不是“是否不同”),常使用单尾检验。

Image/Data Analysis(图像或数据分析)
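In a one-tailed test, the rejection region lies in only one tail of the distribution (left or right). For example, for a test with $H_a: \mu_A > \mu_B$, the rejection region is in the right tail, the region where $\bar{x}_A$ is significantly larger than $\bar{x}_B$.
在单尾检验中,拒绝域只位于分布的一端(左尾或右尾),例如 $H_a: \mu_A > \mu_B$ 的检验,其拒绝域位于右尾,对应 $\bar{x}_A$ 显著大于 $\bar{x}_B$ 的区域。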


🔹Knowledge Point 2 — One- vs. two-tailed tests based on Hₐ(基于 Hₐ 的单尾与双尾)

Explanation(解释)
The form of the alternative hypothesis $H_a$ determines whether the test is one-tailed ($>$ or $<$) or two-tailed ($\neq$).
备择假设 $H_a$ 的形式决定检验是单尾($>$ 或 $<$)还是双尾($\neq$)。

Example(例子)
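Using bottle sizes again:

  • One-tailed (B larger than A): $H_0: \mu_B \le \mu_A$, $H_a: \mu_B > \mu_A$
  • Two-tailed (B different from A): $H_0: \mu_B = \mu_A$, $H_a: \mu_B \neq \mu_A$

继续以上瓶子例子:

  • 单尾检验(B 大于 A):$H_0: \mu_B \le \mu_A$,$H_a: \mu_B > \mu_A$
  • 双尾检验(B 与 A 不同):$H_0: \mu_B = \mu_A$,$H_a: \mu_B \neq \mu_A$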

Extension(拓展)
Two-tailed tests are more conservative because they look for differences in both directions and split the significance level $\alpha$ across the two tails, leaving only $\alpha/2$ in each.
双尾检验更保守,因为它要在两个方向上发现差异,需要将显著性水平 $\alpha$ 分到两个尾部,每侧只有 $\alpha/2$。

Image/Data Analysis(图像或数据分析)
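A one-tailed test has a single rejection region on one side (e.g., the right tail, $z > z_\alpha$); a two-tailed test has half of its rejection region in each tail (e.g., $|z| > z_{\alpha/2}$). Graphically these appear as one or two shaded tail areas.
单尾检验只有一侧拒绝域(例如右尾 $z > z_\alpha$),双尾检验在左右两侧各有一半拒绝域(例如 $|z| > z_{\alpha/2}$),图形上表现为一个或两个阴影尾部区域。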


Summary(总结)
Whether a test is one-tailed or two-tailed is determined by the direction stated in $H_a$, but it must be consistent with how we set up $H_0$.
检验是单尾还是双尾由备择假设 $H_a$ 的方向决定,同时必须与我们设定的原假设 $H_0$ 保持一致。


Slide 6 — Formulating Hypotheses for Mean Tests(第6页——均值检验中假设的写法)

Knowledge Points (知识点)

  1. Equality sign always in the null hypothesis(等号总写在原假设中)
  2. Meaning of μ₀ as the hypothesized mean(μ₀ 作为假设均值的含义)
  3. Three common forms of H₀ and Hₐ(三种常见的原假设与备择假设形式)

🔹Knowledge Point 1 — Equality in H₀(等号出现在 H₀ 中)

Explanation(解释)
In hypothesis testing, the equality sign ($=$, $\le$, $\ge$) is placed in $H_0$ because it represents the “no difference” or “no effect” situation.
在假设检验中,等号($=$、$\le$、$\ge$)写在原假设 $H_0$ 中,因为它代表“无差异”或“无效应”的基准情形。

Example(例子)
$H_0: \mu \ge \mu_0$ vs. $H_a: \mu < \mu_0$ is a lower-tailed test.
例如 $H_0: \mu \ge \mu_0$ 对比 $H_a: \mu < \mu_0$,这是一个左尾检验。

Extension(拓展)
Putting equality in $H_0$ makes it clear what exact value or boundary we are testing against, which helps define the rejection region precisely.
将等号放在 $H_0$ 中,可以明确我们要检验的具体数值或边界,方便精确划定拒绝域。

Image/Data Analysis(图像或数据分析)
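The three blue boxes on the slide show:

  • Left: $H_0: \mu \ge \mu_0$
  • Middle: $H_0: \mu \le \mu_0$
  • Right: $H_0: \mu = \mu_0$

All three contain an equality sign, illustrating that the equality always sits in $H_0$.

课件的三个蓝色框展示:

  • 左:$H_0: \mu \ge \mu_0$
  • 中:$H_0: \mu \le \mu_0$
  • 右:$H_0: \mu = \mu_0$

都包含等号,形象说明“等号总在 $H_0$ 中”。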

🔹Knowledge Point 2 — Role of μ₀(μ₀ 的作用)

Explanation(解释)
$\mu_0$ denotes the hypothesized population mean under $H_0$, usually based on historical data, industry standards, or managerial claims.
$\mu_0$ 表示在原假设 $H_0$ 下假定的总体均值,通常来自历史数据、行业标准或管理层的声明。

Example(例子)
A factory claims its average fill volume is 500 ml, so $\mu_0 = 500$ ml in $H_0$.
一家工厂声称平均灌装量为 500 ml,则在 $H_0$ 中,$\mu_0 = 500$ ml。

Extension(拓展)
Choosing $\mu_0$ carefully is crucial: if it is unrealistic, even a correct process may be wrongly judged as failing.
合理选择 $\mu_0$ 很关键,如果这个值本身不合理,即使过程正常也可能被误判为不合格。

Image/Data Analysis(图像或数据分析)
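All three test forms in the figure are built around $\mu_0$, corresponding to testing whether the mean is significantly below, significantly above, or significantly different from this benchmark.
图中三种检验形式都围绕 $\mu_0$ 展开,对应于测试“是否显著低于、显著高于或显著不同于”该基准值。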


🔹Knowledge Point 3 — Three hypothesis forms(单尾与双尾的三种形式)

Explanation(解释)
For mean tests, we commonly use:

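  • Lower-tailed: $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
  • Upper-tailed: $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
  • Two-tailed: $H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$

对于均值检验,常见三种写法:

  • 左尾检验:$H_0: \mu \ge \mu_0$,$H_a: \mu < \mu_0$
  • 右尾检验:$H_0: \mu \le \mu_0$,$H_a: \mu > \mu_0$
  • 双尾检验:$H_0: \mu = \mu_0$,$H_a: \mu \neq \mu_0$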

Example(例子)
Checking if a machine underfills bottles (too low): use lower-tailed test.
Comparing new process to old with “any change”: use two-tailed test.
若要检验机器是否灌装不足(偏低),用左尾检验;若只关心“是否有变化”,用双尾检验。

Extension(拓展)
The choice among these three forms affects critical values, p-values, and the chance of detecting true differences.
选择哪一种形式会影响临界值、p 值大小以及发现真实差异的能力。

Image/Data Analysis(图像或数据分析)
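The three blue boxes are labelled “one-tailed (lower)”, “one-tailed (upper)”, and “two-tailed”, matching three different rejection regions: left tail, right tail, and both tails.
三个蓝色框分别标注“one-tailed (lower)”、“one-tailed (upper)” 与 “two-tailed”,对应三种不同的拒绝域:左尾、右尾和双尾。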


Summary(总结)
Formulating $H_0$ and $H_a$ correctly, with the equality in $H_0$ and $\mu_0$ as the benchmark, is the first key step of any mean test.
在均值检验中,正确写出以 $\mu_0$ 为基准、等号放在 $H_0$ 中的假设形式,是进行任何检验的第一关键步。


Slide 7 — Type I and Type II Errors(第7页——第一类与第二类错误)

Knowledge Points (知识点)

  1. Why errors occur in hypothesis testing(假设检验中为什么会出错)
  2. Definition of Type I and Type II errors(第一类与第二类错误的定义)
  3. Decision table for H₀ and Hₐ(原假设与备择假设的决策表)

🔹Knowledge Point 1 — Why errors occur(为什么会出错)

Explanation(解释)
Because hypothesis tests rely on sample data, limited sample size and possible biases mean our decisions can be wrong even when we follow the correct procedure.
由于假设检验依赖样本数据,样本有限且可能存在偏差,即使方法正确,我们的结论也可能出错。

Example(例子)
Testing a mean using only 20 observations may by chance give an unusually high or low sample mean.
比如只用 20 个样本来检验均值,样本均值有可能偶然偏高或偏低。

Extension(拓展)
Understanding error types helps managers interpret results carefully rather than treating “reject” or “do not reject” as absolute truth.
理解错误类型可以帮助管理者谨慎解读检验结果,而不是把“拒绝/不拒绝”当成绝对真理。

Image/Data Analysis(图像或数据分析)
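The note above the table on the slide explains that using a sample to make predictions about a population introduces errors through “limited sample size” and “bias”; these are the sources of the two error types in the table.
课件表格上方说明:使用样本预测总体时会因为“有限样本”和“偏差”带来错误,这正是表格中两种错误的来源。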


🔹Knowledge Point 2 — Type I vs. Type II errors(第一类错误与第二类错误)

Explanation(解释)

  • Type I error: reject $H_0$ when $H_0$ is actually true.
  • Type II error: fail to reject $H_0$ when $H_a$ is actually true.

  • 第一类错误:在 $H_0$ 真实时错误地拒绝 $H_0$。
  • 第二类错误:在 $H_a$ 真实时却没有拒绝 $H_0$。

Example(例子)
Quality control:

  • Type I: concluding a good batch is defective (unnecessary scrap).
  • Type II: accepting a bad batch as good (defective products reach customers).
    质量控制中:
  • 第一类错误:把合格批次当成不合格(造成不必要的报废);
  • 第二类错误:把不合格批次当成合格(缺陷产品流向客户)。

Extension(拓展)
The significance level $\alpha$ is the probability of a Type I error; increasing the sample size can often reduce the probability of a Type II error (denoted $\beta$).
显著性水平 $\alpha$ 就是第一类错误的概率;增加样本量通常可以降低第二类错误概率(记为 $\beta$)。
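The link between sample size and $\beta$ can be made concrete with a small calculation. The following is a minimal sketch, assuming an upper-tailed z-test with known $\sigma$; the function name `type_ii_error` and all numeric values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for an upper-tailed z-test of H0: mu <= mu0 when the true mean is mu_true."""
    z_crit = Z.inv_cdf(1 - alpha)                # we reject H0 when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of z when mu = mu_true
    return Z.cdf(z_crit - shift)                 # P(z <= z_crit | mu = mu_true)


# Hypothetical setting: mu0 = 12, true mean 13, sigma = 3.2, alpha = 0.05.
betas = {n: type_ii_error(12, 13, 3.2, n, 0.05) for n in (10, 40, 160)}
for n, beta in betas.items():
    print(f"n = {n:3d}: beta = {beta:.3f}")
```

With $\alpha$ held fixed, the printed $\beta$ shrinks as $n$ grows, which is the trade-off the paragraph above describes.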

Image/Data Analysis(图像或数据分析)
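In the table:

  • When $H_0$ is true, “Do not reject $H_0$” is a correct decision; “Reject $H_0$” is a Type I error.
  • When $H_0$ is false, “Reject $H_0$” is a correct decision; “Do not reject $H_0$” is a Type II error.

The table shows the four possible outcomes obtained by combining the true state of the population with our conclusion.

表格中:

  • $H_0$ 为真且“Do not reject $H_0$”时是正确决策;若“Reject $H_0$”则是 Type I error。
  • $H_0$ 为假且“Reject $H_0$”时是正确决策;若“Do not reject $H_0$”则是 Type II error。

表格直观展示了“总体真实状态”与“我们的结论”组合后四种可能结果。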

🔹Knowledge Point 3 — Decision outcomes table(决策结果表)

Explanation(解释)
The 2×2 decision table summarizes all outcomes: correct decisions in two cells and two error types in the other cells.
2×2 决策表总结了所有可能结果:两个格子是正确决策,另外两个格子分别对应两类错误。

Example(例子)
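Rows represent our conclusion (Reject $H_0$ / Do not reject $H_0$); columns represent the true state ($H_0$ true / $H_0$ false). The highlighted cells show where Type I and Type II errors occur.
行代表结论(Reject $H_0$ / Do not reject $H_0$),列代表真实情况($H_0$ true / $H_0$ false),高亮格子正是发生第一类和第二类错误的位置。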

Extension(拓展)
In practice, we often trade off between $\alpha$ and $\beta$ depending on which type of error is more costly in a given context.
在实际应用中,会根据哪一类错误代价更高,在 $\alpha$ 与 $\beta$ 之间做权衡。

Image/Data Analysis(图像或数据分析)
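The table marks “Type I error” and “Type II error” in yellow and uses a blue background for correct decisions; the colour coding helps students remember and understand the four outcomes.
表格用黄色标出“Type I error”和“Type II error”,蓝色背景表示正确决策,通过颜色区分帮助学生记忆和理解。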


Summary(总结)
Type I and Type II errors arise because we use limited samples; understanding their meaning helps us choose appropriate significance levels and sample sizes.
第一类与第二类错误源于样本有限,理解它们的含义有助于我们选择合适的显著性水平和样本量。


Slide 8 — Hypothesis Testing Using p-Value(第8页——利用 p 值进行假设检验)

Knowledge Points (知识点)

  1. Definition and interpretation of p-value(p 值的定义与含义)
  2. Decision rule with significance level α(结合显著性水平 α 的决策规则)
  3. Strength of evidence based on p-value ranges(不同 p 值区间对应的证据强弱)

🔹Knowledge Point 1 — What is a p-value?(什么是 p 值?)

Explanation(解释)
The p-value is the probability, assuming $H_0$ is true, of obtaining a test statistic at least as extreme as the one observed in the sample.
p 值是在原假设 $H_0$ 为真的前提下,得到“像样本中观察到的那样极端或更极端”的检验统计量的概率。

Example(例子)
If a test for $H_0: \mu = 100$ yields $p = 0.03$, there is a 3% chance of seeing a sample result at least this extreme when the true mean is 100.
如果对 $H_0: \mu = 100$ 的检验得到 $p = 0.03$,表示在真实均值为 100 时,观察到这样极端样本结果的概率只有 3%。

Extension(拓展)
A small p-value suggests the sample result is unlikely under $H_0$, providing evidence in favor of $H_a$; a large p-value suggests data are consistent with $H_0$.
较小的 p 值说明在 $H_0$ 为真时出现此结果的概率很低,从而支持 $H_a$;较大的 p 值说明数据与 $H_0$ 一致。

Image/Data Analysis(图像或数据分析)
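The p-value can be viewed as the tail area of the distribution curve that is more extreme than the observed value; the smaller this tail area, the less the sample result supports $H_0$.
p 值可以视为分布曲线中“比观察值更极端”的尾部区域面积,尾部面积越小,说明样本结果越不支持 $H_0$。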


🔹Knowledge Point 2 — Decision rule with α(结合显著性水平 α 的决策规则)

Explanation(解释)
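If $p \le \alpha$ (e.g., $\alpha = 0.05$), we reject $H_0$ and support $H_a$; if $p > \alpha$, we fail to reject $H_0$.
若 $p \le \alpha$(如 $\alpha = 0.05$),则拒绝 $H_0$、支持 $H_a$;若 $p > \alpha$,则不拒绝 $H_0$。

Example(例子)
With $\alpha = 0.05$, a p-value no larger than 0.05 leads us to reject $H_0$; with a p-value above 0.05, we do not reject $H_0$.
当 $\alpha = 0.05$ 且 p 值不超过 0.05 时,结论是拒绝 $H_0$;若 p 值大于 0.05,则不拒绝 $H_0$。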

Extension(拓展)
Common choices of $\alpha$ are 0.01, 0.05, and 0.10, but in high-risk decisions (e.g., medicine) we may choose a smaller $\alpha$ to reduce Type I error.
常用的显著性水平有 0.01、0.05、0.10,但在高风险领域(如医疗)会选择更小的 $\alpha$ 来减小第一类错误的概率。

Image/Data Analysis(图像或数据分析)
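The figure states that when $p$ is less than or equal to the significance level $\alpha$ (e.g., 0.05), we reject $H_0$ and support $H_a$; this is the concrete decision rule of the p-value method.
图中说明:当 $p$ 小于或等于显著性水平 $\alpha$(如 0.05)时,拒绝 $H_0$ 并支持 $H_a$,形成“p 值法”的具体决策规则。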


🔹Knowledge Point 3 — p-value ranges and evidence strength(p 值区间与证据强弱)

Explanation(解释)
Different ranges of p-values correspond to different strengths of evidence against $H_0$.
不同的 p 值区间对应于对 $H_0$ 不同强度的反对证据。

Example(例子)
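According to the table:

  • $p < 0.01$: very strong evidence for $H_a$
  • $0.01 \le p < 0.05$: strong evidence
  • $0.05 \le p < 0.10$: acceptable evidence
  • $p \ge 0.10$: insufficient evidence

根据表格:

  • $p < 0.01$:对 $H_a$ 的证据非常强;
  • $0.01 \le p < 0.05$:证据强;
  • $0.05 \le p < 0.10$:证据尚可;
  • $p \ge 0.10$:证据不足。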

Extension(拓展)
These cutoffs are conventions, not rigid rules; practical significance and context should also be considered when making decisions.
这些阈值只是统计上的习惯,并非绝对规则,实际决策还要结合效应大小和具体情境。

Image/Data Analysis(图像或数据分析)
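The left column of the table lists the four p-value conditions and the right column gives the matching interpretation (“Very strong evidence”, “Strong evidence”, “Acceptable evidence”, “Insufficient evidence”), helping students move from numbers to words.
表格左列列出四种 p 值条件,右列给出相应解释:“Very strong evidence”、“Strong evidence”、“Acceptable evidence”、“Insufficient evidence”,帮助学生从数值到语言理解 p 值。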
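The decision rule and the evidence-strength labels can be sketched as two small functions. This is an illustration only: the function names are hypothetical, and the cutoffs 0.01, 0.05, and 0.10, together with the choice of which side each boundary value falls on, are assumptions based on the conventional levels mentioned in these notes.

```python
def decide(p, alpha=0.05):
    """p-value rule: reject H0 when p <= alpha, otherwise do not reject."""
    return "reject H0" if p <= alpha else "do not reject H0"


def evidence_strength(p):
    """Map a p-value to a verbal label (boundary handling is an assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "very strong"
    if p < 0.05:
        return "strong"
    if p < 0.10:
        return "acceptable"
    return "insufficient"


for p in (0.003, 0.03, 0.072, 0.25):
    print(p, decide(p), evidence_strength(p))
```

The labels are a reading aid, not a substitute for the comparison with $\alpha$: a p-value of 0.072 is “acceptable” evidence yet still leads to “do not reject” at $\alpha = 0.05$.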


Summary(总结)
The p-value summarizes how incompatible the sample data are with $H_0$; by comparing it with $\alpha$ and using evidence-strength guidelines, we decide whether to reject $H_0$.
p 值概括了样本数据与原假设 $H_0$ 的“不相容程度”,通过将其与 $\alpha$ 比较并参考证据强弱标准,我们可以做出是否拒绝 $H_0$ 的统计决策。


Slide 9 — p-value and Rejection Region(第9页——p 值与拒绝域示例)

Knowledge Points (知识点)

  1. Sampling distribution of z under H₀(在原假设下 z 的抽样分布)
  2. Relationship among p-value, α, and rejection region(p 值、α 与拒绝域的关系)

🔹Knowledge Point 1 — Sampling distribution of z under H₀(在 H₀ 下 z 的抽样分布)

Explanation(解释)
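When testing the mean with known σ, the test statistic

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

follows the standard normal distribution if $H_0$ is true.
当总体标准差已知时,检验统计量

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

在原假设 $H_0$ 为真时服从标准正态分布。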

Example(例子)
In the figure, the yellow curve is the standard normal distribution of z under $H_0$. Any observed z from the sample corresponds to one point on the horizontal axis under this curve.
图中的黄色曲线就是在 $H_0$ 为真时 z 的标准正态分布,样本算出的 z 对应曲线下方横轴上的一个点。

Extension(拓展)
This idea generalizes: for many tests, if $H_0$ is true, the test statistic has a known distribution; we use its tail areas to compute p-values.
这一思想可以推广:在很多检验中,只要 $H_0$ 为真,检验统计量就有已知分布,我们用尾部面积计算 p 值。

Image/Data Analysis(图像或数据分析)
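The figure marks the horizontal z axis with 0 at the centre and a symmetric shape on both sides: under $H_0$ most z-values fall near 0 and the areas in the two tails are small.
图中标出了 z 的水平轴,0 位于中心,左右对称,表示在 $H_0$ 下大部分 z 落在接近 0 的区域,两侧尾部面积较小。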


🔹Knowledge Point 2 — Relationship among p-value, α, and rejection region(p 值、α 与拒绝域)

Explanation(解释)
For a left-tailed test with significance level $\alpha$, the critical value $-z_\alpha$ defines the boundary of the rejection region. If the observed p-value is less than α, we reject $H_0$.
在显著性水平 $\alpha$ 的左尾检验中,临界值 $-z_\alpha$ 决定拒绝域的边界;若观测到的 p 值小于 α,就拒绝 $H_0$。

Example(例子)
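In the figure,

  • $\alpha = 0.10$, so the left-tail critical value is $-z_{0.10} = -1.28$;
  • Observed $z = -1.46$ is further into the left tail;
  • Corresponding p-value is $p = 0.072$, so $p < \alpha$.

Therefore, we reject $H_0$ and support $H_a$ in this example.

图中:

  • $\alpha = 0.10$,左尾临界值约为 $-1.28$;
  • 观测到的 $z = -1.46$ 更靠近左尾;
  • 对应 p 值为 $p = 0.072$,满足 $p < \alpha$。

因此在该例中我们拒绝 $H_0$,支持 $H_a$。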

Extension(拓展)
The p-value and critical-value methods are equivalent:

  • p-value method compares “tail area” p with α;
  • critical-value method compares observed z with $z_\alpha$ or $z_{\alpha/2}$.

p 值法与临界值法是等价的:

  • p 值法比较尾部面积 p 与 α;
  • 临界值法比较观测 z 与 $z_\alpha$ 或 $z_{\alpha/2}$。

Image/Data Analysis(图像或数据分析)
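In the figure, the small blue block is the p-value (area 0.072), and the larger combined yellow-and-blue left-tail region has area α = 0.10. The observed z = −1.46 lies within the blue region, while the critical value $-z_\alpha = -1.28$ lies to the right of it, which shows that “p-value < α” means “z is more extreme than the critical value.”
图中蓝色小块表示 p 值(面积 0.072),大一点的黄蓝合并左尾区域面积为 α=0.10;z=-1.46 位于蓝色区域内部,而 $-z_\alpha = -1.28$ 位于蓝色右边,清楚展示“p 值 < α → z 比临界值更极端”。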
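The left-tailed example (α = 0.10, observed z = −1.46) can be checked numerically. A minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

alpha = 0.10
z_obs = -1.46

z_crit = Z.inv_cdf(alpha)  # left-tail critical value, about -1.28
p_value = Z.cdf(z_obs)     # left-tail area below the observed z, about 0.072

reject_by_p = p_value < alpha  # p-value method
reject_by_z = z_obs < z_crit   # critical-value method

print(f"critical value = {z_crit:.2f}, p-value = {p_value:.3f}")
print("reject H0:", reject_by_p, reject_by_z)
```

Both comparisons give the same answer, which is the equivalence of the two methods described above.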


Summary(总结)
Under $H_0$, z follows the standard normal distribution; if the left-tail p-value (area beyond observed z) is smaller than α, z lies in the rejection region and we reject $H_0$.
在 $H_0$ 下 z 服从标准正态分布,当左尾的 p 值(从观测 z 起的尾部面积)小于 α 时,z 落入拒绝域,我们就拒绝 $H_0$。


Slide 10 — Hypothesis Testing Using z-Score (Two-Tailed)(第10页——利用 z 分数进行双尾检验)

Knowledge Points (知识点)

  1. z-score and its link to p-value(z 分数与 p 值的联系)
  2. Critical values for two-tailed tests(双尾检验的临界值)
  3. Matching p-value ranges with |z| ranges(p 值区间与 |z| 区间对应)

🔹Knowledge Point 1 — z-score and p-value(z 分数与 p 值)

Explanation(解释)
A z-score from the standard normal distribution corresponds to a tail probability p; in hypothesis testing, this p is the p-value.
标准正态分布中的 z 分数对应一个尾部概率 p,在假设检验中,这个 p 就是 p 值。

Example(例子)
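For a two-tailed test at $\alpha = 0.05$, each tail has area $\alpha/2 = 0.025$. The critical values are

$$\pm z_{\alpha/2} = \pm z_{0.025} = \pm 1.96.$$

在显著性水平 $\alpha = 0.05$ 的双尾检验中,每个尾部面积为 $\alpha/2 = 0.025$,相应临界值约为 $\pm 1.96$。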

Extension(拓展)
Software or Excel (e.g., norm.s.inv(0.025)) is often used to find z-critical values corresponding to tail probabilities such as 0.025 or 0.005.
在实际中常用软件或 Excel(如 norm.s.inv(0.025))来求解对应于 0.025、0.005 等尾部概率的 z 临界值。
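For readers working outside Excel, the same lookup can be sketched in Python. This assumes the standard-library `statistics.NormalDist`, whose `inv_cdf` method returns the z-value with a given area to its left, the same quantity the Excel call above computes.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Left-tail areas 0.025 and 0.005 correspond to two-tailed alpha = 0.05 and 0.01.
for tail_area in (0.025, 0.005):
    z = Z.inv_cdf(tail_area)  # negative value, because the area lies in the left tail
    print(f"tail area {tail_area}: z = {z:.2f}, critical values = ±{abs(z):.2f}")
```

Because the standard normal distribution is symmetric, the right-tail critical value is the same number with a positive sign.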

Image/Data Analysis(图像或数据分析)
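The table’s column headings are “Condition (p), Condition (z), Interpretation”; the left column lists ranges of p and the next column gives the corresponding comparison for $|z|$, showing the one-to-one correspondence between p and z.
表格标题为 “Condition (p), Condition (z), Interpretation”,左列是 p 的区间,右列给出对应的 $|z|$ 的比较,说明 p 与 z 的一一对应关系。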


🔹Knowledge Point 2 — Critical values in two-tailed tests(双尾检验中的临界值)

Explanation(解释)
For a two-tailed test, the rejection regions are both tails where $|z| > z_{\alpha/2}$.
在双尾检验中,拒绝域位于两侧尾部,当 $|z| > z_{\alpha/2}$ 时拒绝 $H_0$。

Example(例子)
If $p < 0.05$, the table shows this is equivalent to

$$|z| > 1.96.$$

若 $p < 0.05$,表中给出的等价条件就是 $|z| > 1.96$。

Extension(拓展)
Using $|z|$ emphasizes that for two-tailed tests, large positive or negative z both provide evidence against $H_0$.
使用 $|z|$ 是因为在双尾检验中,无论 z 很大还是很小(绝对值大),都说明样本结果与 $H_0$ 不相容。

Image/Data Analysis(图像或数据分析)
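In the table (two-tailed):

  • $p < 0.01$ corresponds to $|z| > 2.58$
  • $0.01 \le p < 0.05$ corresponds to $1.96 < |z| \le 2.58$
  • $0.05 \le p < 0.10$ corresponds to $1.64 < |z| \le 1.96$
  • $p \ge 0.10$ corresponds to $|z| \le 1.64$

The right-hand column describes the strength of evidence in words.

表格中:

  • $p < 0.01$ 对应 $|z| > 2.58$
  • $0.01 \le p < 0.05$ 对应 $1.96 < |z| \le 2.58$
  • $0.05 \le p < 0.10$ 对应 $1.64 < |z| \le 1.96$
  • $p \ge 0.10$ 对应 $|z| \le 1.64$

右侧一列用语言描述证据强弱。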

🔹Knowledge Point 3 — Evidence strength via p and |z|(通过 p 与 |z| 判断证据强度)

Explanation(解释)
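The table links smaller p and larger $|z|$ to stronger evidence that $H_a$ is true.
该表说明:p 越小、$|z|$ 越大,支持 $H_a$ 的证据越强。

Example(例子)
If the observed z in a two-tailed test satisfies $1.96 < |z| < 2.58$, then $0.01 < p < 0.05$ and we have strong evidence to reject $H_0$.
若得到的 z 满足 $1.96 < |z| < 2.58$,则对应 $0.01 < p < 0.05$,说明有较强证据拒绝 $H_0$。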

Extension(拓展)
This rule lets us use either p-values or critical values depending on which is more convenient, while reaching the same conclusion.
这条规则让我们可以根据方便程度选择 p 值法或临界值法,最终结论是一致的。

Image/Data Analysis(图像或数据分析)
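The right-hand column reads “Very strong / Strong / Acceptable / Insufficient evidence to conclude $H_a$ is true,” which helps translate numbers into a verbal judgment.
表格右列解释为 “Very strong / Strong / Acceptable / Insufficient evidence to conclude $H_a$ is true”,帮助从数值过渡到文字判断。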


Summary(总结)
For two-tailed tests, rejecting $H_0$ corresponds to a small p-value or, equivalently, to $|z|$ exceeding $z_{\alpha/2}$; the two conditions express the same evidence against $H_0$.
在双尾检验中,拒绝 $H_0$ 的条件是 p 值很小,或等价地 $|z|$ 大于 $z_{\alpha/2}$,这两种表述本质上是对同一证据的不同表达。


Slide 11 — Hypothesis Testing Using z-Score (One-Tailed)(第11页——利用 z 分数进行单尾检验)

Knowledge Points (知识点)

  1. Critical value ±zα for one-tailed tests(单尾检验的临界值 ±zα)
  2. Equivalence between p and z conditions(p 条件与 z 条件的等价关系)
  3. Interpreting evidence in one-tailed tests(单尾检验中证据强度的解释)

🔹Knowledge Point 1 — Critical value ±zα(单尾检验的临界值)

Explanation(解释)
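In a one-tailed test, the rejection region lies entirely in one tail; the boundary is given by the critical value $z_\alpha$ (right tail) or $-z_\alpha$ (left tail).
在单尾检验中,拒绝域全部位于一侧尾部,边界由临界值 $z_\alpha$(右尾)或 $-z_\alpha$(左尾)决定。

Example(例子)
For $\alpha = 0.05$ in a right-tailed test,

$$z_\alpha = z_{0.05} = 1.64.$$

We reject $H_0$ if $z > 1.64$.
在显著性水平 $\alpha = 0.05$ 的右尾检验中,$z_\alpha = 1.64$,只有当 $z > 1.64$ 时才拒绝 $H_0$。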

Extension(拓展)
Compared with a two-tailed test using the same α, the one-tailed critical value has a smaller magnitude because the full α is assigned to one side.
与同一 α 的双尾检验相比,单尾检验的临界值绝对值更小,因为全部 α 都集中在一侧尾部。
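The size difference between one-tailed and two-tailed critical values can be tabulated directly. A minimal sketch, assuming the standard-library `statistics.NormalDist`; the helper name `critical_values` is hypothetical. Note that the unrounded one-tailed value for α = 0.05 is about 1.645, which these notes round to 1.64.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def critical_values(alpha):
    """Return (one-tailed z_alpha, two-tailed z_{alpha/2}) for a given alpha."""
    one_tailed = Z.inv_cdf(1 - alpha)      # all of alpha in a single tail
    two_tailed = Z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    return one_tailed, two_tailed


for alpha in (0.10, 0.05, 0.01):
    one, two = critical_values(alpha)
    print(f"alpha = {alpha:.2f}: one-tailed {one:.3f}, two-tailed {two:.3f}")
```

For every α the one-tailed value is the smaller of the two, so a given z in the hypothesized direction clears the one-tailed threshold more easily.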

Image/Data Analysis(图像或数据分析)
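The table notes that for a one-tailed test the boundary is defined by $\pm z_\alpha$; its heading emphasizes “critical value, ±zα” and gives a left-tailed example.
表格说明:对于单尾检验,边界由 $\pm z_\alpha$ 定义,标题中强调 “critical value, ±zα”,并给出了左侧单尾的示例。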


🔹Knowledge Point 2 — p vs. z conditions(p 与 z 条件)

Explanation(解释)
For one-tailed tests, the p-value condition and z condition are equivalent:

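  • If $p < \alpha$, then $|z| > z_\alpha$ (for the relevant tail).
  • If $p \ge \alpha$, then $|z| \le z_\alpha$.

在单尾检验中,p 值条件与 z 条件是等价的:

  • 若 $p < \alpha$,则对应 $|z| > z_\alpha$(在相应尾部);
  • 若 $p \ge \alpha$,则对应 $|z| \le z_\alpha$。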

Example(例子)
If a left-tailed test gives a z with $|z| > 1.28$ but $|z| < 1.64$, this corresponds to $0.05 < p < 0.10$, giving “acceptable” but not “strong” evidence.
若左尾检验得到的 z 满足 $|z|$ 介于 1.28 与 1.64 之间,对应 $0.05 < p < 0.10$,说明证据“可接受但不算很强”。

Extension(拓展)
Using the z table or Excel, we can move freely between z and p to describe the same test result in different ways.
通过查 z 表或使用 Excel,可以在 z 与 p 之间相互转换,用不同方式描述同一个检验结果。

Image/Data Analysis(图像或数据分析)
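The table lists (one-tailed):

  • $p < 0.01$ corresponds to $|z| > 2.33$
  • $0.01 \le p < 0.05$ corresponds to $1.64 < |z| \le 2.33$
  • $0.05 \le p < 0.10$ corresponds to $1.28 < |z| \le 1.64$
  • $p \ge 0.10$ corresponds to $|z| \le 1.28$

The right-hand column gives the strength of evidence for $H_a$.

表格列出了:

  • $p < 0.01$ 对应 $|z| > 2.33$
  • $0.01 \le p < 0.05$ 对应 $1.64 < |z| \le 2.33$
  • $0.05 \le p < 0.10$ 对应 $1.28 < |z| \le 1.64$
  • $p \ge 0.10$ 对应 $|z| \le 1.28$

并在右列给出对 $H_a$ 的证据强度。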

🔹Knowledge Point 3 — Evidence strength in one-tailed tests(单尾检验中的证据强弱)

Explanation(解释)
As in the two-tailed case, smaller p (or larger $|z|$) means stronger evidence that $H_a$ is true, but now all evidence is concentrated in one direction.
与双尾类似,p 越小(或 $|z|$ 越大)说明支持 $H_a$ 的证据越强,只是现在证据集中在一个方向上。

Example(例子)
In quality control, a left-tailed test with $p < 0.01$ (or $z < -2.33$) gives very strong evidence the process mean is below the target value.
在质量控制中,若左尾检验的 $p < 0.01$(或 $z < -2.33$),就有非常强的证据表明过程均值低于目标值。

Extension(拓展)
Because one-tailed tests assign all α to one tail, they are more powerful in detecting directional effects but cannot detect changes in the opposite direction.
由于单尾检验把全部 α 分配在一侧,检测指定方向差异的能力更强,但无法发现相反方向的变化。

Image/Data Analysis(图像或数据分析)
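The right-hand column again reads “Very strong / Strong / Acceptable / Insufficient evidence to conclude $H_a$ is true,” showing how to summarize a one-tailed result in words.
表格右列解释仍为 “Very strong / Strong / Acceptable / Insufficient evidence to conclude $H_a$ is true”,强调如何用语言总结单尾检验的结果。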


Summary(总结)
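In one-tailed tests, we compare p with α or z with $\pm z_\alpha$: a small p, or equivalently a $|z|$ beyond $z_\alpha$ in the hypothesized direction, provides directional evidence to reject $H_0$ in favor of $H_a$.
在单尾检验中,通过比较 p 与 α 或比较 z 与 $\pm z_\alpha$,当 p 足够小或 $|z|$ 在假设方向上大于 $z_\alpha$ 时,我们就有方向性的证据来拒绝 $H_0$、支持 $H_a$。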


Slide 12 — Hypothesis Testing Procedure (σ Known)(第12页——已知 σ 时的假设检验步骤)

Knowledge Points (知识点)

  1. Five-step z-test procedure(z 检验的五个步骤)
  2. Role of significance level α and critical value(显著性水平与临界值的作用)
  3. Using z-value vs. z-critical(z 值与 z 临界值的比较)

🔹Knowledge Point 1 — Five-step z-test procedure(z 检验五步骤)

Explanation(解释)
When σ is known, hypothesis testing for the mean follows a standard five-step z-test procedure.
当总体标准差 σ 已知时,对均值的假设检验通常遵循标准的五步 z 检验流程。

Example(例子)
Steps shown on the slide:

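  1. Develop $H_0$ and $H_a$.
  2. Choose significance level α and whether the test is one- or two-tailed.
  3. Collect sample data and compute the test statistic $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.
  4. Use α to find the critical value(s) $z_\alpha$ or $z_{\alpha/2}$.
  5. Compare the z-value to the critical value(s) and decide whether to reject $H_0$.

课件中的五步:

  1. 写出 $H_0$ 与 $H_a$;
  2. 选定显著性水平 α,并确定是单尾还是双尾;
  3. 收集样本并计算 z 值 $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$;
  4. 根据 α 求出临界值 $z_\alpha$ 或 $z_{\alpha/2}$;
  5. 比较 z 值与临界值,决定是否拒绝 $H_0$。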

Extension(拓展)
These steps can also be implemented with p-values: compute z, find its p-value, then compare p with α instead of using critical values.
同样的流程也可用 p 值实现:先算 z,再求出对应的 p 值,最后比较 p 与 α,而不必显式计算临界值。
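The five steps, together with the p-value variant, can be sketched as one function. This is an illustration under stated assumptions rather than a reference implementation: the function name `z_test`, its `tail` argument, and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def z_test(xbar, mu0, sigma, n, alpha=0.05, tail="two"):
    """z-test for a mean with known sigma; tail is 'lower', 'upper', or 'two'."""
    z = (xbar - mu0) / (sigma / sqrt(n))   # step 3: test statistic
    if tail == "lower":
        crit = Z.inv_cdf(alpha)            # step 4: critical value -z_alpha
        p = Z.cdf(z)
        reject = z < crit                  # step 5: compare z with the critical value
    elif tail == "upper":
        crit = Z.inv_cdf(1 - alpha)        # critical value z_alpha
        p = 1 - Z.cdf(z)
        reject = z > crit
    elif tail == "two":
        crit = Z.inv_cdf(1 - alpha / 2)    # critical value z_{alpha/2}
        p = 2 * (1 - Z.cdf(abs(z)))
        reject = abs(z) > crit
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return {"z": z, "critical": crit, "p": p, "reject": reject}


# Hypothetical data: H0: mu = 50, sample mean 52, sigma = 5, n = 25.
result = z_test(xbar=52, mu0=50, sigma=5, n=25, alpha=0.05, tail="two")
print(result)
```

Steps 1 and 2 are the caller's choices (the hypotheses fix `mu0` and `tail`, and `alpha` is picked in advance); the function covers steps 3 to 5 and also returns the p-value so either decision rule can be applied.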

Image/Data Analysis(图像或数据分析)
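The orange box on the right of the slide reads “Use z-value (vs. z-critical)”, reminding students that the z-value can also be used together with the p-value rather than relying only on a table of critical values.
幻灯片右侧橙色框写着 “Use z-value (vs. z-critical)”,提醒学生可以使用 z 值配合 p 值,而不是只依赖临界值表。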


🔹Knowledge Point 2 — Role of α and critical value(α 与临界值的作用)

Explanation(解释)
The significance level α controls the probability of Type I error and determines the cutoff point(s) separating “likely under $H_0$” from “unlikely under $H_0$.”
显著性水平 α 控制第一类错误的概率,并决定“在 $H_0$ 下可能发生”与“不太可能发生”之间的分界点。

Example(例子)
If α changes from 0.05 to 0.01, the critical values move farther into the tails, making it harder to reject $H_0$ but reducing Type I error risk.
例如将 α 从 0.05 降至 0.01,临界值会更远离中心,使得拒绝 $H_0$ 更困难,但第一类错误风险更小。

Extension(拓展)
Choice of α should reflect the cost of Type I error vs. Type II error in the real decision context.
选择 α 时要考虑现实情境中第一类错误与第二类错误的代价孰高孰低。

Image/Data Analysis(图像或数据分析)
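Step 4 specifically emphasizes “use the level of significance α to determine the critical value,” showing that α and the critical value are the link between the theoretical hypotheses and the practical decision.
步骤 4 特别强调 “use the level of significance α to determine the critical value”,说明 α 和临界值是衔接理论假设与实际决策的关键环节。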


🔹Knowledge Point 3 — Using z-value vs. z-critical(使用 z 值与 z 临界值)

Explanation(解释)
The slide notes “Use z-value (vs. z-critical)” to highlight that we compute an actual z from data and compare it to theoretical critical values from the standard normal distribution.
幻灯片提示“Use z-value (vs. z-critical)”是为了强调:我们从样本计算得到一个实际的 z 值,再与标准正态分布中给出的理论临界值进行比较。

Example(例子)
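Suppose the observed z exceeds 1.64 in a right-tailed test with α = 0.05. Since $z_\alpha = 1.64$ and $z > z_\alpha$, we reject $H_0$.
例如在 α = 0.05 的右尾检验中,若观测 z 大于 1.64,因 $z_\alpha = 1.64$ 且 $z > z_\alpha$,所以拒绝 $H_0$。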

Extension(拓展)
In practice we usually report both the test statistic (z-value) and its p-value to provide a complete picture of the evidence.
实际报告中通常同时给出检验统计量 z 和对应的 p 值,以全面展示证据的大小。

Image/Data Analysis(图像或数据分析)
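The slide lists the five steps clearly, with the z formula shown in bold to stress its central role in the whole testing procedure.
幻灯片结构清楚列出五个步骤,z 公式被加粗显示,强调其在整个检验过程中的核心地位。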


Summary(总结)
When σ is known, hypothesis testing for the mean follows a five-step z-test: specify $H_0$/$H_a$, choose α, compute z, find critical values (or p-value), and then compare to decide whether to reject $H_0$.
在总体标准差已知的情况下,均值的假设检验遵循五步 z 流程:设定 $H_0$/$H_a$、选择 α、计算 z、找到临界值或 p 值,再比较并决定是否拒绝 $H_0$。


Slide 13 — Two-Tailed z-Value and p-Value Approach(第13页——双尾检验中的 z 值与 p 值方法)

Knowledge Points (知识点)

  1. Two-tailed hypothesis test with α = 0.05(显著性水平为 0.05 的双尾检验)
  2. Relationship among z-value, critical value, and p-value(z 值、临界值与 p 值的关系)
  3. Strong evidence to reject H₀ in both tails(在两侧尾部拒绝 H₀ 的强证据)

🔹Knowledge Point 1 — Two-tailed test with α = 0.05(显著性水平 0.05 的双尾检验)

Explanation(解释)
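For a two-tailed test with significance level $\alpha = 0.05$, each tail has area $\alpha/2 = 0.025$, and the critical values are

$$\pm z_{\alpha/2} = \pm 1.96.$$

在显著性水平 $\alpha = 0.05$ 的双尾检验中,每一侧尾部的面积为 $\alpha/2 = 0.025$,对应的临界值为 $\pm 1.96$。

Example(例子)
We test

$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu \neq \mu_0,$$

so any large positive or negative deviation of $\bar{x}$ from $\mu_0$ can lead to rejection of $H_0$.
我们检验

$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu \neq \mu_0,$$

因此样本均值 $\bar{x}$ 无论向上还是向下明显偏离 $\mu_0$ 都可能导致拒绝 $H_0$。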

Extension(拓展)
Two-tailed tests are used when we are interested in “any difference,” not just an increase or decrease.
当我们关心的是“是否有差异”而不是单纯“变大或变小”时,应使用双尾检验。

Image/Data Analysis(图像或数据分析)
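The green curve in the figure is the sampling distribution of z when $H_0$ is true; the shaded regions at the two ends each have area $\alpha/2 = 0.025$, with critical points $-1.96$ and $1.96$. Any observed z outside these two critical points lies in the rejection region (marked by the “Reject H₀” bubbles).
图中绿色曲线表示在 $H_0$ 为真时 z 的抽样分布;左右两端阴影区域各为 $\alpha/2 = 0.025$,临界点分别为 $-1.96$ 与 $1.96$。只要观测 z 落在这两个临界点之外,就位于拒绝域(圆圈“Reject H₀”所示)。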


🔹Knowledge Point 2 — z-value, p-value and rejection(z 值、p 值与拒绝决策)

Explanation(解释)
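In the example, the observed z-values are $z = -2.74$ and $z = 2.74$; each tail area is

$$P(Z \ge 2.74) = 0.0031,$$

so the total p-value is $p = 2 \times 0.0031 = 0.0062$.
在该例中,观测到的 z 值为 $z = \pm 2.74$;每一侧尾部面积为

$$P(Z \ge 2.74) = 0.0031,$$

因此总 p 值为 $p = 2 \times 0.0031 = 0.0062$。

Example(例子)
Because $|z| = 2.74 > 1.96$ and $p = 0.0062 < 0.05$, we reject $H_0$ and conclude $\mu$ is significantly different from $\mu_0$.
由于 $|z| = 2.74 > 1.96$,且 $p = 0.0062 < 0.05$,我们拒绝 $H_0$,认为 $\mu$ 与 $\mu_0$ 存在显著差异。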

Extension(拓展)
This illustrates that the z-value approach (compare $z$ to $\pm z_{\alpha/2}$) and the p-value approach (compare $p$ to $\alpha$) give exactly the same decision.
该图说明 z 值法(比较 $z$ 与 $\pm z_{\alpha/2}$)与 p 值法(比较 p 与 α)在结论上完全一致。

Image/Data Analysis(图像或数据分析)
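In the figure, z = ±2.74 lie in the small dark tails on either side, while the critical points ±1.96 sit between 0 and ±2.74; the blue “Reject H₀” bubbles show that the null hypothesis is rejected in both extreme regions.
图中 z = ±2.74 位于两侧深色小尾部中,而临界点 ±1.96 介于 0 与 ±2.74 之间;蓝色气泡“Reject H₀”表明在两侧极端区域均拒绝原假设。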
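The two-tailed figures from this slide (z = ±2.74, α = 0.05) can be reproduced with a short calculation. A minimal sketch using the standard-library `statistics.NormalDist`; the slide's p ≈ 0.0062 comes from a table value rounded to four decimals, so the unrounded result differs slightly in the last digit.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

alpha = 0.05
z_obs = 2.74  # by symmetry, z = -2.74 gives the same p-value

tail_area = 1 - Z.cdf(abs(z_obs))  # area beyond |z| in one tail
p_value = 2 * tail_area            # two-tailed p-value
z_crit = Z.inv_cdf(1 - alpha / 2)  # about 1.96

print(f"one-tail area = {tail_area:.4f}, p-value = {p_value:.4f}")
print("reject by z:", abs(z_obs) > z_crit, "| reject by p:", p_value < alpha)
```

Both rules point to rejecting the null hypothesis, matching the conclusion drawn from the figure.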


Summary(总结)
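In a two-tailed test with $\alpha = 0.05$, z-values beyond ±1.96 give p-values below 0.05; in this example $z = \pm 2.74$ gives $p \approx 0.0062$, which is strong evidence to reject $H_0$.
在显著性水平为 0.05 的双尾检验中,只要 |z| 超过 1.96,p 值就小于 0.05;本例中 z = ±2.74 对应 p≈0.0062,从而强烈支持拒绝 $H_0$。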


Slide 14 — Example of z-Test (σ Known, Upper Tail)(第14页——已知 σ 的 z 检验示例:右尾检验)

Knowledge Points (知识点)

  1. Setting up hypotheses for a performance goal(围绕绩效目标设定假设)
  2. Computing z-value and p-value for the sample(计算样本的 z 值与 p 值)
  3. Making decisions and interpreting results(作出检验结论并解释)

🔹Knowledge Point 1 — Hypotheses for process goal(围绕过程目标设定假设)

Explanation(解释)
We test whether the goal “average time per unit is 12 minutes or less” has been achieved.
必须检验“平均每件加工时间是否不超过 12 分钟”的目标是否达到。

Example(例子)
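Population standard deviation is $\sigma = 3.2$ minutes, sample size is $n = 40$ units.
We set

$$H_0: \mu \le 12, \qquad H_a: \mu > 12.$$

This is an upper-tailed test.
总体标准差为 $\sigma = 3.2$ 分钟,样本量 $n = 40$。设定

$$H_0: \mu \le 12, \qquad H_a: \mu > 12.$$

这是一个右尾检验。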

Extension(拓展)
Here $H_0$ represents “goal is met or better,” while $H_a$ represents “process is too slow,” which is the situation the manager is worried about.
此处 $H_0$ 表示“目标已达成或更好”,而 $H_a$ 表示“流程太慢、目标未达成”,恰好是管理者担心的情形。
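The upper-tailed setup for this slide can be turned into a small calculation. The sketch below uses the values from the slide's worked calculation ($\mu_0 = 12$, $\sigma = 3.2$, $n = 40$, $\alpha = 0.05$) and the two sample means, 13.25 and 12.5, examined in the knowledge points that follow; the helper name `upper_tail_test` is hypothetical. The slide's p-values come from z rounded to two decimals, so the unrounded results differ slightly.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

mu0, sigma, n, alpha = 12, 3.2, 40, 0.05
z_crit = Z.inv_cdf(1 - alpha)  # upper-tail critical value, about 1.645


def upper_tail_test(xbar):
    """Return (z, p) for H0: mu <= mu0 against Ha: mu > mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p = 1 - Z.cdf(z)  # area to the right of the observed z
    return z, p


for xbar in (13.25, 12.5):
    z, p = upper_tail_test(xbar)
    verdict = "reject H0" if z > z_crit else "do not reject H0"
    print(f"xbar = {xbar}: z = {z:.2f}, p = {p:.4f} -> {verdict}")
```

The first sample mean lands in the rejection region and the second does not, even though both exceed the 12-minute target.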


🔹Knowledge Point 2 — z-test with sample mean 13.25(样本均值为 13.25 时的 z 检验)

Explanation(解释)
For sample mean $\bar{x}_1 = 13.25$, the z-value is

$$z_1 = \frac{\bar{x}_1 - \mu_0}{\sigma/\sqrt{n}} = \frac{13.25 - 12}{3.2/\sqrt{40}} = 2.47.$$

The corresponding p-value for an upper-tail test is $p = 0.0068$.
当样本均值为 $\bar{x}_1 = 13.25$ 时,

$$z_1 = \frac{13.25 - 12}{3.2/\sqrt{40}} = 2.47,$$

对应右尾检验的 p 值为 $p = 0.0068$。

**Example(例子)**
With $\alpha = 0.05$, we have $p = 0.0068 < 0.05$ and $z_1 = 2.47 > z_\alpha = 1.64$. Thus we reject $H_0$ and conclude that the 12-minute goal has *not* been achieved.
在显著性水平 $\alpha = 0.05$ 下,$p = 0.0068 < 0.05$,且 $z_1 = 2.47 > z_\alpha = 1.64$,因此拒绝 $H_0$,得出“12 分钟目标尚未实现”的结论。

**Extension(拓展)**
Both the z-critical method ($z_1 > z_\alpha$) and the p-value method ($p < \alpha$) lead to the same decision.
无论是 z 临界值法($z_1 > z_\alpha$)还是 p 值法($p < \alpha$)都给出相同结论。

---

### 🔹Knowledge Point 3 — z-test with sample mean 12.5(样本均值为 12.5 时的 z 检验)

**Explanation(解释)**
If the sample mean were $\bar{x}_2 = 12.5$, then

$$z_2 = \frac{\bar{x}_2 - \mu_0}{\sigma/\sqrt{n}} = \frac{12.5 - 12}{3.2/\sqrt{40}} = 0.99.$$

The corresponding p-value is $p = 0.1611$.
若样本均值为 $\bar{x}_2 = 12.5$,则

$$z_2 = \frac{12.5 - 12}{3.2/\sqrt{40}} = 0.99,$$

对应的 p 值为 $p = 0.1611$。

**Example(例子)**
Now $p = 0.1611 > 0.05$ and $z_2 = 0.99 < z_\alpha = 1.64$, so we cannot reject $H_0$; the 12-minute goal can be regarded as achieved.
此时 $p = 0.1611 > 0.05$ 且 $z_2 = 0.99 < 1.64$,所以不能拒绝 $H_0$,结论是“12 分钟目标可以认为已经实现”。

**Extension(拓展)**
This comparison shows how two different sample means lead to opposite conclusions about the same process, highlighting the role of sampling variability.
两个不同的样本均值导致完全不同的检验结果,说明抽样波动对决策有重要影响。

**Image/Data Analysis(图像或数据分析)**
The right side of the slide carries the labels “Use z-value (vs. z-critical)” and “Use p-value (vs. α)”, a reminder that the same z-test example can be read in two equivalent ways.
幻灯片右侧分别标注“Use z-value (vs. z-critical)”和“Use p-value (vs. α)”,提醒可以用两种等价方式解读同一 z 检验示例。

---

**Summary(总结)**
For an upper-tail z-test with known σ, we transform the sample mean into a z-value, find the corresponding p-value, and compare p with α (or z with the critical value) to decide whether the performance goal has been achieved.
在已知 σ 的右尾检验中,我们把样本均值转化为 z 值,找到对应的 p 值,再与 α 或 z 临界值比较,以判断流程目标是否实现。

---

# Slide 15 — Two-Tailed p-Value Approach(第15页——双尾检验的 p 值方法)

## Knowledge Points (知识点)

1. Sampling distribution and rejection regions(抽样分布与拒绝域)
2. Decision rule using p-value in a two-tailed test(双尾检验中基于 p 值的决策规则)
3. Consistency with critical-value approach(与临界值方法的一致性)

---

### 🔹Knowledge Point 1 — Sampling distribution and regions(抽样分布与区域划分)

**Explanation(解释)**
Under $H_0:\mu = \mu_0$, the test statistic

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

follows the standard normal distribution.
在 $H_0:\mu = \mu_0$ 下,检验统计量

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

服从标准正态分布。

**Example(例子)**
With $\alpha = 0.05$ for a two-tailed test, the central region between $-1.96$ and $1.96$ is the “Do Not Reject $H_0$” region; both tails beyond ±1.96 are rejection regions.
在双尾检验且 $\alpha = 0.05$ 时,区间 $(-1.96, 1.96)$ 为“不拒绝 $H_0$ 区域”,两端超出 ±1.96 的尾部为拒绝域。

**Extension(拓展)**
This visualization shows that moderate deviations of z from 0 remain likely under $H_0$ and therefore should *not* automatically lead to rejection; only values in the extreme tails do.
该图直观地展示:z 的适度偏离仍然常见,并不能轻易拒绝 $H_0$,只有落入尾部极端区域才会拒绝。

**Image/Data Analysis(图像或数据分析)**
In the figure, the green central body is labeled “Do Not Reject H₀” and the two small white tails are labeled “Reject H₀”; each tail has area $\alpha/2 = 0.025$, with boundaries at z = −1.96 and z = 1.96.
图中绿色主体标注为 “Do Not Reject H₀”,两侧小白色尾部标注 “Reject H₀”,每个尾部面积为 $\alpha/2 = 0.025$,对应 z = −1.96 与 z = 1.96。

---

### 🔹Knowledge Point 2 — Decision rule using p-value(基于 p 值的双尾决策规则)

**Explanation(解释)**
For a two-tailed test, the p-value is the total area in both tails beyond the observed $|z|$. If $p \le \alpha$, reject $H_0$; otherwise, do not reject $H_0$.
在双尾检验中,p 值是“超出观测 $|z|$ 的两侧尾部面积之和”。若 $p \le \alpha$,则拒绝 $H_0$;否则不拒绝 $H_0$。

**Example(例子)**
If $z = 2.10$, then p is the area to the right of 2.10 plus the area to the left of −2.10, about 0.036 in total. Because this is less than 0.05, we reject $H_0$.
例如 $z = 2.10$ 时,p 值是右侧 z≥2.10 的面积加上左侧 z≤−2.10 的面积,总面积约为 0.036,小于 0.05,因此拒绝 $H_0$。

**Extension(拓展)**
The p-value approach works for any test whose sampling distribution is known; modern software typically reports p-values directly.
只要知道检验统计量的分布,就可以用 p 值方法;现代统计软件通常直接给出 p 值,使用非常方便。

**Image/Data Analysis(图像或数据分析)**
A gray box reads “p-value approach (two-tailed) α=0.05”, and a blue box restates $H_0:\mu=\mu_0$, $H_a:\mu\neq\mu_0$, making clear that the figure illustrates the two-tailed p-value method.
灰色框标注 “p-value approach (two-tailed) α=0.05”,蓝色框再次写出 $H_0:\mu=\mu_0, H_a:\mu\neq\mu_0$,强调该图是专门解释双尾 p 值法的。

---

### 🔹Knowledge Point 3 — Consistency with critical values(与临界值法的一致性)

**Explanation(解释)**
For $\alpha = 0.05$, the rule “reject $H_0$ if $p \le 0.05$” is equivalent to “reject $H_0$ if $|z| \ge 1.96$.”
在 α = 0.05 下,“若 p ≤ 0.05 则拒绝 $H_0$” 与“若 $|z| ≥ 1.96$ 则拒绝 $H_0$”是等价的。

**Example(例子)**
When $z = 1.50$, we have $|z| < 1.96$ and p ≈ 0.13 > 0.05, so we do not reject $H_0$.
当 $z = 1.50$ 时,$|z| < 1.96$ 且 p≈0.13>0.05,因此不拒绝 $H_0$。

**Extension(拓展)**
Choosing between the p-value and critical-value methods is mainly a matter of convenience and reporting style, not of substance.
在实际应用中,选择 p 值法还是临界值法主要取决于方便程度和报告习惯,本质上没有区别。

**Image/Data Analysis(图像或数据分析)**
The central region of the figure is labeled “Do Not Reject H₀” in large type: whenever z lies between −1.96 and 1.96, the p-value exceeds 0.05; only when z falls in the thin shaded tails is p below 0.05.
图像中间区域用大字“Do Not Reject H₀”,说明当 z 在 −1.96 和 1.96 之间时,p 值一定大于 0.05;只有落入两侧细小阴影时 p 才小于 0.05。

---

**Summary(总结)**
In a two-tailed test, the p-value equals the combined tail area beyond $|z|$; if that area does not exceed α, the sample result is too extreme to be consistent with $H_0$, and we reject $H_0$.
在双尾检验中,p 值就是 $|z|$ 以外两侧尾部的总面积;当这部分面积不超过 α 时,说明样本结果十分极端,应拒绝原假设。

---

# Slide 16 — Confidence Interval Approach to Hypothesis Testing(第16页——利用置信区间进行假设检验)

## Knowledge Points (知识点)

1. Using confidence intervals to test hypotheses(利用置信区间进行假设检验的思路)
2. Decision rule: contain vs. not contain μ₀(“是否包含 μ₀” 的决策规则)
3. Example comparing two sample means(比较两个样本均值的示例)

---

### 🔹Knowledge Point 1 — CI-based testing idea(基于置信区间的检验思想)

**Explanation(解释)**
We draw a simple random sample, compute the sample mean $\bar{x}$, and construct a confidence interval for the population mean $\mu$ at a given confidence level (e.g., 95%).
我们从总体中抽取简单随机样本,计算样本均值 $\bar{x}$,并在某个置信水平(如 95%)下构造总体均值 $\mu$ 的置信区间。

**Example(例子)**
A 95% confidence interval for $\mu$ is

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\quad \alpha = 0.05.$$

例如 95% 置信区间为

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\quad \alpha = 0.05$$

**Extension(拓展)**
This approach uses the same information as the z-test but presents the result as a range of plausible values for $\mu$.
这种方法使用的信息与 z 检验相同,但把结果表达为 $\mu$ 的一个“合理取值区间”。

---

### 🔹Knowledge Point 2 — Decision rule using CI(利用置信区间的决策规则)

**Explanation(解释)**
To test $H_0:\mu = \mu_0$ (or $\mu \le \mu_0$), we check whether the hypothesized value $\mu_0$ lies inside the confidence interval. If the CI *contains* $\mu_0$, do not reject $H_0$; if it does *not contain* $\mu_0$, reject $H_0$.
在检验 $H_0:\mu = \mu_0$(或 $\mu \le \mu_0$)时,判断假设值 $\mu_0$ 是否落在置信区间内:若置信区间“包含” $\mu_0$,则不拒绝 $H_0$;若“未包含” $\mu_0$,则拒绝 $H_0$。

**Example(例子)**
This rule is equivalent to the two-tailed z-test at the same α, because the CI endpoints are set by the same critical value $z_{\alpha/2}$.
这一规则与同一 α 水平下的双尾 z 检验等价,因为置信区间的端点正是由 $z_{\alpha/2}$ 决定的。

**Extension(拓展)**
CI-based decisions are often easier to explain to non-statisticians: they show not only whether we reject $H_0$ but also how far the plausible range lies from $\mu_0$.
置信区间法对非统计专业人士更直观:它不仅告诉我们是否拒绝 $H_0$,还展示“合理区间”与 $\mu_0$ 相差多远。

---

### 🔹Knowledge Point 3 — Example with mean 13.25 vs. 12.5(样本均值 13.25 与 12.5 的示例)

**Explanation(解释)**
Given $\mu_0 = 12$, $\sigma = 3.2$, $n = 40$, $\alpha = 0.05$ ($z_{\alpha/2}=1.96$):

1. For $\bar{x}_1 = 13.25$: $$\bar{x}_1 \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 13.25 \pm 1.96\frac{3.2}{\sqrt{40}} = 13.25 \pm 0.992 = (12.26,\ 14.24).$$
2. For $\bar{x}_2 = 12.5$: $$\bar{x}_2 \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 12.5 \pm 1.96\frac{3.2}{\sqrt{40}} = 12.5 \pm 0.992 = (11.51,\ 13.49).$$

在 $\mu_0 = 12,\ \sigma = 3.2,\ n = 40,\ \alpha = 0.05$ 条件下:

1. 若 $\bar{x}_1 = 13.25$,得到置信区间 $(12.26, 14.24)$;
2. 若 $\bar{x}_2 = 12.5$,得到置信区间 $(11.51, 13.49)$。

**Example(例子)**

- In case 1, $\mu_0 = 12$ is *not* contained in $(12.26, 14.24)$, so we reject $H_0$: the goal is not achieved.
- In case 2, $\mu_0 = 12$ *is* contained in $(11.51, 13.49)$, so we do not reject $H_0$: the goal can be regarded as achieved.
- 情形 1:12 不在区间 $(12.26, 14.24)$ 内,因此拒绝 $H_0$,认为目标未实现;
- 情形 2:12 落在区间 $(11.51, 13.49)$ 内,因此不拒绝 $H_0$,认为目标可以视为实现。

**Extension(拓展)**
These CI-based conclusions match the earlier z-test results for $\bar{x}_1$ and $\bar{x}_2$. Strictly, a two-sided interval built with $z_{\alpha/2} = 1.96$ corresponds to a two-tailed test at $\alpha = 0.05$ (or an upper-tail test at $\alpha = 0.025$); for both sample means here, the decision is the same under either critical value.
可以看到,这里的置信区间结论与之前对 $\bar{x}_1$ 与 $\bar{x}_2$ 的 z 检验结果一致。严格来说,使用 $z_{\alpha/2} = 1.96$ 的双侧置信区间对应 $\alpha = 0.05$ 的双尾检验(或 $\alpha = 0.025$ 的右尾检验);本例中两个样本均值在两种临界值下的结论相同。

**Image/Data Analysis(图像或数据分析)**
The example slide lists the two intervals on separate rows and sets “is not contained” and “is contained” in bold, showing how “contains 12 or not” maps onto “do not reject / reject $H_0$”.
示例幻灯片中分两行列出了两个区间,并用粗体标出 “is not contained” 与 “is contained”,直观展示“是否包含 12”与“拒绝/不拒绝 $H_0$”之间的对应关系。

---

**Summary(总结)**
The confidence-interval approach tests $H_0$ by checking whether $\mu_0$ lies inside a CI for $\mu$; if it does not, we reject $H_0$, and the conclusion matches the corresponding z-test at the same α.
置信区间法通过检查 $\mu_0$ 是否落在总体均值的置信区间内来检验 $H_0$;一旦不在区间内就拒绝 $H_0$,且与对应的 z 检验结果完全相同。
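
---

**Worked check in code(代码验算)**
The calculations on Slides 14 to 16 can be reproduced with a few lines of Python. The following is a minimal sketch, not part of the original slides: it uses only the standard library (`statistics.NormalDist`), and the helper names `upper_tail_z_test`, `two_tailed_p_value`, and `confidence_interval` are hypothetical names chosen for illustration. It assumes the upper-tail hypotheses $H_0:\mu \le 12$ vs. $H_a:\mu > 12$ implied by the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_tail_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Upper-tail z-test (sigma known). Returns (z, p_value, reject_h0)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 1.0 - STD_NORMAL.cdf(z)         # area to the right of z
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)  # about 1.645 for alpha = 0.05
    return z, p_value, z > z_crit


def two_tailed_p_value(z):
    """Total area in both tails beyond |z| under the standard normal."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def confidence_interval(x_bar, sigma, n, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for mu (sigma known)."""
    margin = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


if __name__ == "__main__":
    mu0, sigma, n = 12.0, 3.2, 40  # values taken from the slides
    for x_bar in (13.25, 12.5):
        z, p, reject = upper_tail_z_test(x_bar, mu0, sigma, n)
        low, high = confidence_interval(x_bar, sigma, n)
        contains_mu0 = low <= mu0 <= high
        print(f"x_bar={x_bar}: z={z:.2f}, p={p:.4f}, reject H0={reject}, "
              f"95% CI=({low:.2f}, {high:.2f}), contains mu0={contains_mu0}")
    print(f"two-tailed p for z=1.50: {two_tailed_p_value(1.50):.2f}")
```

The p-values printed here may differ from the slide values in the fourth decimal place, because the slides round z to two decimals before looking it up in a normal table, while the code uses the unrounded z.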