Lecture 14 — Sampling and Sampling Distributions

第14讲——抽样与抽样分布

Slide 1 — Sampling and Sampling Distributions

第1页——抽样与抽样分布

Knowledge Points (知识点)

Sampling（抽样）
Sampling distribution（抽样分布）
Key measures: $\bar{x}$ , $\hat{p}$ （关键统计量）

Knowledge Point 1 — Sampling（抽样）

Explanation（解释）
Sampling is the process of selecting a subset of a population to estimate population characteristics.
抽样是从总体中选择部分个体，以推断总体特征的统计过程。
Example（例子）
A researcher surveys 100 WKU students instead of all students to estimate average weekly spending.
研究者通过调查100名温州肯恩大学学生，而非全体学生，以估计平均每周支出。
Extension（拓展）
Sampling reduces cost and time, and when properly designed, yields accurate estimates of population parameters.
抽样可以节约成本和时间，只要设计合理，也能提供对总体参数的准确估计。
Summary（总结）
Sampling allows inference about the whole population without a complete census.
抽样能在不进行全面调查的情况下推断总体特征。

Knowledge Point 2 — Sampling Distribution（抽样分布）

Explanation（解释）
A sampling distribution shows how a sample statistic varies when repeated samples are drawn from the same population.
抽样分布描述了在重复抽样时样本统计量的变动规律。
Example（例子）
Drawing multiple samples of 50 students each and calculating their mean GPA produces a distribution of sample means: the sampling distribution of $\bar{x}$ .
若多次抽取50名学生并计算平均绩点，这些均值的分布即为样本均值 $\bar{x}$ 的抽样分布。
Extension（拓展）
The concept of sampling distribution is the foundation for understanding probability-based inference such as confidence intervals and hypothesis testing.
抽样分布概念是理解基于概率的统计推断（如置信区间与假设检验）的基础。
Summary（总结）
Sampling distribution connects sample variability with population characteristics through probability.
抽样分布通过概率理论将样本变异与总体特征联系起来。

Knowledge Point 3 — Key Measures: $\bar{x}$ and $\hat{p}$ （关键统计量）

Explanation（解释）
The sample mean $\bar{x}$ estimates the population mean $μ$ ; the sample proportion $\hat{p}$ estimates the population proportion $p$ .
样本均值 $\bar{x}$ 用于估计总体均值 $μ$ ，样本比例 $\hat{p}$ 用于估计总体比例 $p$ 。
Example（例子）
If 60 of 100 surveyed students use mobile payments, then $\hat{p} = 60 / 100 = 0.6$ , an estimate of $p$ .
若100名学生中有60人使用移动支付，则 $\hat{p} = 60 / 100 = 0.6$ ，为总体比例 $p$ 的估计。
Extension（拓展）
These sample statistics serve as point estimators that summarize information about the population.
这些样本统计量作为点估计量，用于概括总体信息。
Summary（总结）
$\bar{x}$ and $\hat{p}$ are central statistics for inference about means and proportions.
$\bar{x}$ 与 $\hat{p}$ 是均值与比例推断中的核心统计量。

Slide 2 — Selecting a Sample (Finite Population)

第2页——抽样选择（有限总体）

Knowledge Points (知识点)

Finite population（有限总体）
Equal chance selection（等概率抽样）
Randomness and representativeness（随机性与代表性）

Knowledge Point 1 — Finite Population（有限总体）

Explanation（解释）
A finite population has a fixed and countable number of elements, such as 900 applicants or 500 customers.
有限总体由固定数量的成员构成，如900名申请者或500位顾客。
Example（例子）
St. Andrew’s College has $N = 900$ applications, and a sample of $n = 30$ is randomly selected for review.
圣安德鲁学院共有 $N = 900$ 份申请，随机选取 $n = 30$ 名申请者进行评估。
Extension（拓展）
Sampling from a finite population allows for exact probability assignment and control over bias.
从有限总体抽样能精确控制概率与偏差，结果更可重复。
Summary（总结）
Finite populations provide a known frame for random selection, ensuring every member has a measurable chance.
有限总体提供明确抽样框，使每个成员都有可测的选中概率。

Knowledge Point 2 — Equal Chance Selection（等概率抽样）

Explanation（解释）
Every element in the population has an equal probability of being chosen.
总体中的每个成员被选中的概率相同。
Example（例子）
Assign numbers 1–900 to applicants and use a random number generator to select 30 IDs.
将900位申请者编号为1至900，并使用随机数生成器抽取30人。
Extension（拓展）
Equal chance sampling ensures fairness and prevents selection bias.
等概率抽样可保证抽样公平，避免选择偏差。
Summary（总结）
Equal probability is the foundation of valid random sampling.
等概率原则是随机抽样的基本条件。
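The numbering-and-draw procedure from the example can be sketched in Python. This is a minimal illustration under stated assumptions, not a prescribed method: the function name `draw_simple_random_sample` and the seed value are hypothetical choices made here. `random.sample` draws without replacement, so each of the 900 IDs has the same chance of being selected.

```python
import random


def draw_simple_random_sample(population_size, sample_size, seed=None):
    """Return sorted IDs drawn without replacement; every ID is equally likely."""
    rng = random.Random(seed)                # seed only makes the draw repeatable
    ids = range(1, population_size + 1)      # applicants numbered 1..N
    return sorted(rng.sample(ids, sample_size))


# N = 900 applicants, n = 30 selected (the seed value is arbitrary).
selected = draw_simple_random_sample(900, 30, seed=14)
print(selected)
```

Fixing the seed is useful for auditing a draw; leaving it as `None` gives a fresh selection each run.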

Knowledge Point 3 — Randomness and Representativeness（随机性与代表性）

Explanation（解释）
A representative sample accurately reflects population characteristics; randomness helps achieve that.
具有代表性的样本能准确反映总体特征，而随机性能帮助实现这一目标。
Example（例子）
Selecting 30 students randomly from different majors avoids concentration bias.
从不同专业随机选取30名学生可避免样本集中偏差。
Extension（拓展）
Representativeness depends on both random selection and sufficient sample size.
样本代表性取决于随机抽样与样本量的充足。
Summary（总结）
A random and representative sample supports accurate statistical inference.
随机且具代表性的样本能提供更准确的推断。

Slide 3 — Selecting a Sample (Infinite Population)

第3页——抽样选择（无限总体）

Knowledge Points (知识点)

Infinite population（无限总体）
Random sampling from infinite population（无限总体的随机抽样）
Independence of observations（观测独立性）

Knowledge Point 1 — Infinite Population（无限总体）

Explanation（解释）
An infinite population occurs when new elements continuously appear, e.g., production output or customer arrivals.
无限总体指不断生成新数据的总体，如生产线输出或顾客到达。
Example（例子）
A company monitoring daily website visits treats each new visit as part of an infinite population.
公司监测每日网站访问量时，将每次访问视为无限总体的一部分。
Extension（拓展）
Infinite populations are modeled probabilistically since their total number is unobservable.
无限总体无法穷尽，只能通过概率模型进行刻画。
Summary（总结）
Infinite populations describe processes rather than finite groups.
无限总体用于描述持续过程而非固定群体。

Knowledge Point 2 — Random Sampling from Infinite Population（无限总体随机抽样）

Explanation（解释）
Sampling from an infinite population requires independent selection of each observation.
从无限总体抽样时，每个观测值必须独立选择。
Example（例子）
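Selecting every 100th customer transaction from a continuous system (systematic sampling) approximates a random sample, provided the order of transactions has no periodic pattern.
在连续交易系统中，每隔100笔抽取一笔（系统抽样）可近似随机样本，前提是交易顺序不存在周期性规律。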
Extension（拓展）
Independence ensures unbiased estimation and valid inference from streaming data.
保证样本独立能确保无偏估计与有效推断。
Summary（总结）
Independent random sampling is essential when dealing with infinite populations.
独立随机抽样是无限总体分析的必要条件。

Knowledge Point 3 — Independence of Observations（观测独立性）

Explanation（解释）
Independence means the selection of one element does not affect another’s chance.
独立性指一个样本被选中不会影响另一个样本的选中概率。
Example（例子）
When readings from an automated sensor are spaced so that one reading does not influence the next, they can be treated as independent observations.
若自动传感器的各次读数互不影响，则可将其视为相互独立的观测值。
Extension（拓展）
Violating independence leads to biased estimates and underestimated variability.
若独立性被破坏，将导致估计偏差与方差低估。
Summary（总结）
Independence preserves randomness and validity in sampling results.
样本独立性保证随机性与结果的有效性。

Slide 4 — Point Estimation

第4页——点估计

Knowledge Points (知识点)

Point estimation（点估计）
Sample statistic vs population parameter（样本统计量与总体参数）
Unbiasedness and efficiency（无偏性与有效性）

Knowledge Point 1 — Point Estimation（点估计）

Explanation（解释）
Point estimation uses sample statistics to estimate population parameters.
点估计是利用样本统计量来估计总体参数的过程。
Example（例子）
Use $\bar{x}$ to estimate $μ$ , or $\hat{p}$ to estimate $p$ .
以样本均值 $\bar{x}$ 估计总体均值 $μ$ ，以样本比例 $\hat{p}$ 估计总体比例 $p$ 。
Extension（拓展）
Good estimators should be unbiased ( $E (\bar{x}) = μ$ ), consistent, and efficient.
优秀的估计量应具备无偏性（ $E (\bar{x}) = μ$ ）、一致性与有效性。
Summary（总结）
Point estimation provides a single best guess of population parameters.
点估计为总体参数提供最佳单点估计。

Knowledge Point 2 — Sample Statistic vs Population Parameter（样本统计量与总体参数）

Sample Statistic（样本统计量）	Population Parameter（总体参数）
$\bar{x}$	$μ$
$s$	$σ$
$\hat{p}$	$p$

Explanation（解释）
Sample statistics are used as estimators for corresponding population parameters.
样本统计量是相应总体参数的估计量。
Example（例子）
If the sample variance is $s^{2} = 4$ , it estimates population variance $σ^{2}$ .
若样本方差为 $s^{2} = 4$ ，则它用于估计总体方差 $σ^{2}$ 。
Extension（拓展）
These relationships are key to connecting data to theoretical parameters.
这些对应关系是从样本到总体推断的基础。
Summary（总结）
Each sample statistic provides information about a specific population characteristic.
每个样本统计量反映总体的某一特征。

Knowledge Point 3 — Unbiasedness and Efficiency（无偏性与有效性）

Explanation（解释）
An estimator is unbiased if its expected value equals the true parameter.
若估计量的期望值等于总体真实值，则称其为无偏。
Example（例子）
Since $E (\bar{x}) = μ$ , the sample mean is an unbiased estimator of the population mean.
因 $E (\bar{x}) = μ$ ，故样本均值是总体均值的无偏估计。
Extension（拓展）
Efficiency compares the variances of unbiased estimators; smaller variance means higher efficiency.
有效性比较无偏估计量的方差，方差越小效率越高。
Summary（总结）
Unbiased and efficient estimators yield reliable and precise statistical inferences.
无偏且有效的估计量能提供更精确的推断。
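Efficiency can be illustrated with a small simulation. The sketch below assumes a standard normal population and compares two estimators of its centre, the sample mean and the sample median. The median as comparison estimator, the sample size of 25, the number of repetitions, and the helper name `estimator_variance` are all assumptions introduced here for illustration. For a normal population both estimators are unbiased, so the one with the smaller variance is the more efficient.

```python
import random
import statistics


def estimator_variance(estimator, n, reps, rng, mu=0.0, sigma=1.0):
    """Variance of an estimator across `reps` simulated samples of size n."""
    estimates = [
        estimator([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.pvariance(estimates)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
var_mean = estimator_variance(statistics.fmean, n=25, reps=4000, rng=rng)
var_median = estimator_variance(statistics.median, n=25, reps=4000, rng=rng)

print(f"variance of sample mean   = {var_mean:.4f}")
print(f"variance of sample median = {var_median:.4f}")
```

Under these assumptions the sample mean shows the smaller variance, which is what "more efficient" means in the text.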

Slide 5 — Comments on Sampling

第5页——关于抽样的说明

Knowledge Points (知识点)

Population vs Sample（总体与样本）
Representativeness（样本代表性）
Sampling bias（抽样偏差）

Knowledge Point 1 — Population vs Sample（总体与样本）

Explanation（解释）
The population is the complete set of elements we want to study, while the sample is a smaller group taken from it.
总体是研究目标的全部成员，而样本是从总体中抽取的一部分。
Example（例子）
A company wants to know employee satisfaction across all 2,000 workers, but surveys only 200 randomly chosen employees.
某公司想了解2000名员工的满意度，仅随机调查200人作为样本。
Extension（拓展）
The closer the sample characteristics are to the population, the more valid the statistical inference.
样本特征越接近总体特征，统计推断结果越可靠。
Summary（总结）
Sampling provides manageable data while maintaining a link to the population’s properties.
抽样使数据更可管理，同时保持总体特征的代表性。

Knowledge Point 2 — Representativeness（样本代表性）

Explanation（解释）
Representativeness means the sample accurately mirrors the diversity and proportions of the population.
代表性意味着样本准确反映总体的多样性与比例。
Example（例子）
If 60% of WKU students are female, a representative sample should also have roughly 60% females.
若温肯大学学生中女性占60%，则具有代表性的样本中女性比例也应接近60%。
Extension（拓展）
Random sampling, stratified sampling, and sufficient sample size increase representativeness.
随机抽样、分层抽样及足够的样本量能提高样本代表性。
Summary（总结）
Representative samples ensure that sample statistics reflect true population parameters.
代表性样本能保证样本统计量准确反映总体参数。

Knowledge Point 3 — Sampling Bias（抽样偏差）

Explanation（解释）
Sampling bias occurs when the method of selection systematically favors certain outcomes.
抽样偏差指抽样方法系统性地偏向某一结果。
Example（例子）
Conducting an online survey about internet use may exclude individuals without online access.
通过网络问卷调查上网习惯会排除没有网络的人群，从而产生偏差。
Extension（拓展）
Bias can be reduced through randomization and strict adherence to sampling procedures.
通过随机化与规范抽样程序可减少偏差。
Summary（总结）
Avoiding sampling bias is crucial for reliable and generalizable conclusions.
避免抽样偏差是获得可靠、可推广结论的关键。

Slide 6 — Sampling Distribution of $\bar{x}$

第6页——样本均值的抽样分布

Knowledge Points (知识点)

Sampling distribution of $\bar{x}$ （样本均值的分布）
Expected value $E (\bar{x}) = μ$ （样本均值的期望）
Unbiased estimator（无偏估计量）

Knowledge Point 1 — Sampling Distribution of $\bar{x}$ （样本均值的分布）

Explanation（解释）
The sampling distribution of $\bar{x}$ is the probability distribution of all possible sample means from samples of size $n$ .
样本均值 $\bar{x}$ 的抽样分布描述了所有样本量为 $n$ 的样本均值的概率分布。
Example（例子）
Suppose we take 100 different samples of 50 students each, compute each mean GPA, and plot the results; the plot approximates the sampling distribution of $\bar{x}$ .
假设抽取100个样本（每个样本包含50名学生），计算各样本平均绩点，其分布近似于 $\bar{x}$ 的抽样分布。
Extension（拓展）
The sampling distribution allows the application of probability to describe the variability of statistics.
抽样分布使我们能够用概率刻画统计量的变动性。
Summary（总结）
Sampling distributions bridge descriptive statistics and inferential analysis.
抽样分布连接了描述统计与推断统计。

Knowledge Point 2 — Expected Value $E (\bar{x}) = μ$ （样本均值的期望）

Explanation（解释）
The expected value of $\bar{x}$ equals the true population mean $μ$ :
$E (\bar{x}) = μ$
样本均值的期望等于总体均值 $μ$ ，即 $E (\bar{x}) = μ$ 。
Example（例子）
If the population mean height of students is $μ = 170$ cm, the long-run average of the sample means over repeated sampling is also 170 cm.
若总体平均身高 $μ = 170$ cm，多次重复抽样所得样本均值的长期平均值也为170 cm。
Extension（拓展）
This property shows that the sampling distribution of $\bar{x}$ is centered at $μ$ , so $\bar{x}$ is an unbiased estimator.
这一特征表明样本均值 $\bar{x}$ 的抽样分布以总体均值为中心，因此是无偏的。
Summary（总结）
The equality $E (\bar{x}) = μ$ is the mathematical foundation of unbiased estimation.
$E (\bar{x}) = μ$ 是无偏估计的重要数学基础。
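The property that the mean of the sample means equals the population mean can be checked by simulation. The sketch below is illustrative only: the population of 5,000 heights is synthetic (generated around 170 cm with an assumed standard deviation of 8 cm), and the sample size of 50 and the 2,000 repetitions are arbitrary choices made here.

```python
import random
import statistics

rng = random.Random(170)  # fixed seed so the run is repeatable

# Hypothetical finite population of 5,000 student heights in cm.
population = [rng.gauss(170, 8) for _ in range(5000)]
mu = statistics.fmean(population)

# Draw 2,000 simple random samples of size 50 and record each sample mean.
sample_means = [
    statistics.fmean(rng.sample(population, 50))
    for _ in range(2000)
]

mean_of_means = statistics.fmean(sample_means)
print(f"population mean      = {mu:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
```

Individual sample means differ from the population mean, but their average lands very close to it, which is what unbiasedness describes.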

Knowledge Point 3 — Unbiased Estimator（无偏估计量）

Explanation（解释）
An estimator is unbiased if its expected value equals the true parameter being estimated.
若估计量的期望值等于所估参数的真实值，则称其为无偏估计量。
Example（例子）
Since $E (\bar{x}) = μ$ , $\bar{x}$ is an unbiased estimator of $μ$ .
因 $E (\bar{x}) = μ$ ，所以 $\bar{x}$ 是 $μ$ 的无偏估计量。
Extension（拓展）
Other unbiased estimators include the sample proportion $\hat{p}$ for the population proportion $p$ , and the sample variance $s^{2}$ (computed with divisor $n - 1$ ) for $σ^{2}$ .
其他无偏估计量还包括：样本比例 $\hat{p}$ 估计总体比例 $p$ ，样本方差 $s^{2}$ （除数为 $n - 1$ ）估计总体方差 $σ^{2}$ 。
Summary（总结）
Unbiasedness ensures estimators neither overestimate nor underestimate true values on average.
无偏性保证估计量在长期平均下既不高估也不低估总体参数。
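The unbiasedness of the sample variance depends on dividing by $n - 1$ rather than $n$. The simulation below is a sketch under assumptions chosen here for illustration: a normal population with $σ^{2} = 4$, samples of size 5, and 20,000 repetitions. `statistics.variance` uses the $n - 1$ divisor and `statistics.pvariance` uses $n$.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed for repeatability
sigma = 2.0              # assumed population standard deviation, so sigma^2 = 4
n, reps = 5, 20000

s2_values = []           # sample variance with divisor n - 1
biased_values = []       # variance formula with divisor n
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    s2_values.append(statistics.variance(sample))
    biased_values.append(statistics.pvariance(sample))

avg_s2 = statistics.fmean(s2_values)
avg_biased = statistics.fmean(biased_values)
print(f"average with divisor n-1: {avg_s2:.3f}")
print(f"average with divisor n:   {avg_biased:.3f}")
```

On average the $n - 1$ version sits near 4, while the divisor-$n$ version falls short by roughly the factor $(n - 1) / n$.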

Slide 7 — Making Statistical Inference

第7页——统计推断的过程

Knowledge Points (知识点)

Relationship between $\bar{x}$ and $μ$ （样本均值与总体均值的关系）
Statistical inference steps（统计推断步骤）
Estimation and decision making（估计与决策）

Knowledge Point 1 — Relationship between $\bar{x}$ and $μ$ （样本均值与总体均值）

Explanation（解释）
The sample mean $\bar{x}$ serves as a point estimator for the population mean $μ$ .
样本均值 $\bar{x}$ 是总体均值 $μ$ 的点估计量。
Example（例子）
If $\bar{x} = 3.4$ is obtained from a random sample of 50 students, we infer $μ \approx 3.4$ .
若从50名学生的样本中得到 $\bar{x} = 3.4$ ，则可推测总体均值 $μ \approx 3.4$ 。
Extension（拓展）
The reliability of $\bar{x}$ as an estimator depends on sample size $n$ and sampling variability.
$\bar{x}$ 作为估计量的可靠性取决于样本量 $n$ 与抽样变异程度。
Summary（总结）
$\bar{x}$ connects observed data to the theoretical population mean $μ$ .
样本均值 $\bar{x}$ 将观测数据与总体参数 $μ$ 联系起来。

Knowledge Point 2 — Statistical Inference Steps（统计推断步骤）

Explanation（解释）
Statistical inference converts sample information into knowledge about the population.
统计推断是将样本信息转化为总体结论的过程。
Example（例子）
Steps:
1. Select a random sample.
2. Compute $\bar{x}$ .
3. Use $\bar{x}$ to estimate $μ$ or test hypotheses about $μ$ .
推断过程包括：① 随机抽样；② 计算样本均值；③ 用 $\bar{x}$ 推断或检验 $μ$ 。
Extension（拓展）
These steps form the foundation for hypothesis testing and confidence interval estimation.
该过程为假设检验与置信区间估计奠定基础。
Summary（总结）
Statistical inference generalizes sample findings to population conclusions.
统计推断使样本结论能推广至总体。

Knowledge Point 3 — Estimation and Decision Making（估计与决策）

Explanation（解释）
Once $\bar{x}$ estimates $μ$ , managers can use this estimate for strategic or policy decisions.
当 $\bar{x}$ 用于估计 $μ$ 后，管理者可据此进行战略或政策决策。
Example（例子）
A firm estimates average customer spending at $\bar{x} = \$85$ ; decisions on pricing or inventory follow this estimate.
企业估计顾客平均消费为 $\bar{x} = \$85$ ，据此调整定价与库存策略。
Extension（拓展）
Inferential results must be interpreted cautiously, considering sampling error and confidence level.
推断结果应结合抽样误差与置信水平谨慎解释。
Summary（总结）
Estimation links data analysis to actionable business and policy insights.
统计估计将数据分析转化为可操作的商业与政策见解。

Slide 8 — Sampling Distribution of $\bar{x}$ (Standard Error)

第8页——样本均值的抽样分布（标准误差）

Knowledge Points (知识点)

Definition of standard error（标准误差定义）
Finite population correction factor（有限总体修正系数）
Infinite population approximation（无限总体近似）

Knowledge Point 1 — Definition of Standard Error（标准误差定义）

Explanation（解释）
The standard deviation of the sampling distribution of $\bar{x}$ , denoted $σ_{\bar{x}}$ , is called the standard error of the mean. For a large or infinite population, $σ_{\bar{x}} = \frac{σ}{\sqrt{n}}$ .
样本均值抽样分布的标准差称为均值的标准误差，记作 $σ_{\bar{x}}$ 。对于较大或无限总体， $σ_{\bar{x}} = \frac{σ}{\sqrt{n}}$ 。
Example（例子）
For a population with $σ = 10$ and sample size $n = 25$ ,
$σ_{\bar{x}} = \frac{10}{\sqrt{25}} = 2.$
Extension（拓展）
The standard error measures how much sample means vary around the population mean.
标准误差表示样本均值围绕总体均值的波动程度，反映估计精度。
Summary（总结）
Smaller $σ_{\bar{x}}$ means higher precision in estimating $μ$ .
$σ_{\bar{x}}$ 越小，估计总体均值 $μ$ 的精度越高。

Knowledge Point 2 — Finite Population Correction Factor（有限总体修正系数）

Explanation（解释）
For a finite population of size $N$ , the standard error is adjusted using the finite population correction factor (FPC):
$σ_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \times \frac{σ}{\sqrt{n}}$
其中 $\sqrt{\frac{N - n}{N - 1}}$ 即为有限总体修正系数。
Example（例子）
If $N = 1000$ , $n = 100$ , and $σ = 20$ , then
$σ_{\bar{x}} = \sqrt{\frac{900}{999}} \times \frac{20}{\sqrt{100}} \approx 1.90.$
Extension（拓展）
When $n / N > 0.05$ , the FPC must be applied to avoid overestimating variability.
当抽样比例 $n / N > 0.05$ 时，必须使用修正系数以防高估变异性。
Summary（总结）
The FPC accounts for the reduced variability of $\bar{x}$ when sampling without replacement from a finite population.
修正系数反映了从有限总体无放回抽样时 $\bar{x}$ 变异性的降低。
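The standard error calculation, with and without the correction factor, can be sketched as a small Python function. It uses $σ_{\bar{x}} = \sqrt{(N - n) / (N - 1)} \times σ / \sqrt{n}$ for a finite population and $σ / \sqrt{n}$ otherwise. The function name `standard_error` and its optional `population_size` argument are hypothetical, introduced only for this illustration.

```python
import math


def standard_error(sigma, n, population_size=None):
    """Standard error of the mean; applies the FPC when a population size is given."""
    se = sigma / math.sqrt(n)
    if population_size is not None:
        # Finite population correction: sqrt((N - n) / (N - 1))
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Infinite-population case: sigma = 10, n = 25
print(standard_error(10, 25))

# Finite-population case: N = 1000, n = 100, sigma = 20
print(round(standard_error(20, 100, population_size=1000), 2))
```

Calling the function with and without `population_size` for the same `sigma` and `n` shows how little the correction matters when $n / N$ is small.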

Knowledge Point 3 — Infinite Population Approximation（无限总体近似）

Explanation（解释）
For large or infinite populations,
$σ_{\bar{x}} = \frac{σ}{\sqrt{n}}$
applies, since the effect of FPC becomes negligible.
Example（例子）
When $n / N \leq 0.05$ , we can treat the population as infinite.
若 $n / N \leq 0.05$ ，则可将总体视为无限总体。
Extension（拓展）
This assumption simplifies calculations while maintaining high accuracy for large $N$ .
对于较大总体，此假设能简化计算且精度高。
Summary（总结）
Infinite population assumption is reasonable when sample fraction is small.
当抽样比例极小，视总体为无限是合理近似。

Slide 9 — Normal Distribution of $\bar{x}$ (Central Limit Theorem)

第9页——样本均值的正态分布（中心极限定理）

Knowledge Points (知识点)

Sampling distribution of $\bar{x}$ （样本均值的分布）
Rule of sample size（样本量规则）
Central Limit Theorem（中心极限定理）

Knowledge Point 1 — Sampling Distribution of $\bar{x}$ （样本均值的分布）

Explanation（解释）
If the population follows a normal distribution, then $\bar{x}$ is normally distributed for any $n$ .
若总体服从正态分布，则样本均值 $\bar{x}$ 在任意样本量下也服从正态分布。
Example（例子）
If student IQ scores are normally distributed with mean 100 and standard deviation 15, then the sample mean $\bar{x}$ is also normally distributed with mean 100 and standard error $15 / \sqrt{n}$ .
若学生智商服从均值为100、标准差为15的正态分布，则样本均值 $\bar{x}$ 亦服从均值为100、标准误差为 $15 / \sqrt{n}$ 的正态分布。
Extension（拓展）
This property simplifies inferential analysis for populations known to be normal.
若总体正态，则简化推断计算，可直接应用标准正态法则。
Summary（总结）
$\bar{x}$ retains normality when the population itself is normally distributed.
若总体正态，样本均值也保持正态。

Knowledge Point 2 — Rule of Sample Size（样本量规则）

Explanation（解释）
For non-normal populations, the sampling distribution of $\bar{x}$ is approximately normal when the sample size is large (rule of thumb: $n \geq 30$ ).
若总体非正态，当样本量较大（ $n \geq 30$ ）时，样本均值分布近似正态。
Example（例子）
In skewed income data, sample means of $n = 50$ or $n = 100$ approximate a normal curve.
对偏态收入数据，样本量50或100时，其样本均值分布近似正态。
Extension（拓展）
For highly skewed data or extreme outliers, a larger sample ( $n \geq 50$ ) is required.
若总体严重偏态或存在极端值，则应采用 $n \geq 50$ 的样本量。
Summary（总结）
Larger samples yield more symmetric and normal-like sampling distributions.
样本量越大，样本均值分布越接近正态。

Knowledge Point 3 — Central Limit Theorem（中心极限定理）

Explanation（解释）
The Central Limit Theorem (CLT) states that as $n$ increases, the sampling distribution of $\bar{x}$ approaches a normal distribution regardless of the population's shape, provided the population has a finite variance.
中心极限定理指出：当样本量 $n$ 增大时，无论总体分布形态如何，样本均值分布都会趋近于正态。
Example（例子）
Even if population sales data are skewed, sample means of $n = 40$ are nearly normal.
即使销售数据偏态，当样本量为40时，样本均值分布也接近正态。
Extension（拓展）
CLT justifies using normal models in estimation and hypothesis testing.
中心极限定理为估计与假设检验中使用正态模型提供理论依据。
Summary（总结）
CLT is the foundation for inferential statistics based on large samples.
中心极限定理是大样本统计推断的理论基础。
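The Central Limit Theorem can be illustrated by simulation. The sketch below assumes an exponential population, chosen here only because it is strongly right-skewed, and uses samples of size 40 to match the example. The helper `skewness` is a hypothetical function written for this illustration; it computes the standardized third moment, which is near 0 for a symmetric distribution.

```python
import random
import statistics


def skewness(values):
    """Standardized third moment: about 0 for symmetric data, positive for right skew."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])


rng = random.Random(9)  # fixed seed for repeatability

# Individual observations from a right-skewed (exponential) population.
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Means of 5,000 samples of size n = 40 from the same population.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(40)])
    for _ in range(5000)
]

print(f"skewness of raw observations: {skewness(raw):.2f}")
print(f"skewness of sample means:     {skewness(sample_means):.2f}")
```

The raw observations stay heavily skewed while the sample means are far closer to symmetric, which is the behaviour the theorem describes.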

Slide 10 — Example: St Andrew’s College (SAT Distribution)

第10页——案例：圣安德鲁学院（SAT分数抽样分布）

Knowledge Points (知识点)

Sampling distribution of $\bar{x}$ （样本均值分布）
Standard error computation（标准误差计算）
Probability estimation（概率估计）

Knowledge Point 1 — Sampling Distribution of $\bar{x}$ （样本均值分布）

Explanation（解释）
For SAT scores with $μ = 1090$ , $σ = 80$ , and $n = 30$ , the mean of the sampling distribution is $E (\bar{x}) = 1090$ .
当 SAT 分数总体均值 $μ = 1090$ 、标准差 $σ = 80$ 、样本量 $n = 30$ 时，样本均值分布的期望为 $E (\bar{x}) = 1090$ 。
Example（例子）
$σ_{\bar{x}} = \frac{σ}{\sqrt{n}} = \frac{80}{\sqrt{30}} \approx 14.6$
均值标准误差约为 14.6。
Extension（拓展）
The narrower the sampling distribution, the higher the precision of $\bar{x}$ .
抽样分布越窄，样本均值估计越精确。
Summary（总结）
$\bar{x}$ is approximately normally distributed, centered at $μ = 1090$ , with $σ_{\bar{x}} \approx 14.6$ .
样本均值服从均值为1090、标准误差为14.6的正态分布。

Knowledge Point 2 — Probability Estimation（概率估计）

Explanation（解释）
We want $P (| \bar{x} - μ | \leq 10)$ for a sample of 30 students.
求样本均值与总体均值差不超过10的概率。
Example（例子）
$z = \frac{10}{14.6} = 0.685$
$P (- 0.685 < z < 0.685) = 0.7533 - 0.2467 = 0.5066$
概率约为 0.5066，即约 50.66%。
Extension（拓展）
About half of all 30-student samples will have mean SAT scores within ±10 of 1090.
约50%的30人样本均值会落在总体均值1090的±10范围内。
Summary（总结）
The example illustrates using the $z$ -score and normal probability to assess sampling accuracy.
此案例展示了如何用标准正态分布求样本均值精度概率。
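The standardization step behind this probability can be written out in full. The derivation below is a sketch that uses only the values given in this example ( $μ = 1090$ , $σ = 80$ , $n = 30$ ); the symbol $\Phi$ for the standard normal cumulative distribution function is notation introduced here and does not appear in the slides.

```latex
\begin{aligned}
\sigma_{\bar{x}} &= \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{30}} \approx 14.6 \\
P\left(|\bar{x} - \mu| \le 10\right)
  &= P\left(-\frac{10}{\sigma_{\bar{x}}} \le z \le \frac{10}{\sigma_{\bar{x}}}\right)
   = P(-0.685 \le z \le 0.685) \\
  &= \Phi(0.685) - \Phi(-0.685) \approx 0.7533 - 0.2467 = 0.5066
\end{aligned}
```

Dividing the tolerance of 10 points by the standard error converts the question about $\bar{x}$ into a question about a standard normal variable, which is why a single $z$ table suffices.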

Knowledge Point 3 — Statistical Interpretation（统计解释）

Explanation（解释）
A larger sample size reduces $σ_{\bar{x}}$ , increasing the probability that $\bar{x}$ is close to $μ$ .
样本量越大，标准误差越小，样本均值越可能接近总体均值。
Example（例子）
If $n$ doubles to 60, then $σ_{\bar{x}} = 80 / \sqrt{60} \approx 10.3$ , and accuracy improves.
若样本量增至60，标准误差变为10.3，精确度提升。
Extension（拓展）
This relationship shows why large samples lead to more stable statistical inference.
该关系解释了为何大样本能带来更稳定的统计推断。
Summary（总结）
Increasing $n$ enhances confidence that $\bar{x}$ approximates $μ$ .
样本量越大，对 $\bar{x}$ 接近 $μ$ 的置信程度越高。

Slide 12 — Example: Effect of Sample Size on Sampling Distribution

第12页——样本量对抽样分布的影响

Knowledge Points (知识点)

Relationship between sample size and standard error（样本量与标准误差的关系）
Probability comparison ( $n = 30$ vs. $n = 100$ )（不同样本量下的概率比较）
Interpretation of variability（变异性的解释）

Knowledge Point 1 — Relationship Between Sample Size and Standard Error（样本量与标准误差的关系）

Explanation（解释）
As the sample size $n$ increases, the standard error $σ_{\bar{x}}$ decreases, making the sampling distribution narrower.
随着样本量 $n$ 的增加，标准误差 $σ_{\bar{x}}$ 会减小，从而使样本均值的分布更集中、更窄。
Example（例子）
When $n = 30$ , $σ_{\bar{x}} = 80 / \sqrt{30} \approx 14.6$ ; when $n = 100$ , $σ_{\bar{x}} = 80 / \sqrt{100} = 8$ .
样本量为30时，标准误差为14.6；样本量为100时，标准误差为8。
Extension（拓展）
This shows that larger samples provide more consistent estimates of $μ$ , reducing random fluctuations.
这说明样本量越大，对总体均值 $μ$ 的估计越稳定，随机波动越小。
Summary（总结）
Increasing sample size leads to higher precision in estimating population parameters.
增加样本量能显著提高总体参数估计的精确度。

Knowledge Point 2 — Probability Comparison: $n = 30$ vs. $n = 100$ （样本量下的概率比较）

Explanation（解释）
We compare the probability that $\bar{x}$ is within ±10 of $μ = 1090$ for two sample sizes.
比较在样本量不同（30与100）时，样本均值距总体均值 ±10 的概率。
Example（例子）
For $n = 30$ :
$z = \frac{10}{14.6} = 0.685 \Rightarrow P (- 0.685 < z < 0.685) = 0.5066$
For $n = 100$ :
$z = \frac{10}{8} = 1.25 \Rightarrow P (- 1.25 < z < 1.25) = 0.7887$
当 $n = 30$ 时，概率为 0.5066；当 $n = 100$ 时，概率提升至 0.7887。
Extension（拓展）
As $n$ increases, $\bar{x}$ becomes more concentrated around $μ$ , raising the probability of accurate estimation.
随着样本量增加，样本均值更集中于总体均值周围，估计的准确概率上升。
Summary（总结）
Larger sample size improves precision and confidence in estimation.
样本量越大，估计越精确，对结果的置信度越高。
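The comparison between the two sample sizes can be reproduced with Python's standard library. This is a minimal sketch assuming $\bar{x}$ is normally distributed with standard error $σ / \sqrt{n}$ and $σ = 80$ as in the SAT example; the function name `prob_within` is hypothetical. Because the code does not round $z$ before looking up the probability, its result for $n = 30$ can differ from the table-based 0.5066 in the last digits.

```python
import math
from statistics import NormalDist


def prob_within(tolerance, sigma, n):
    """P(|x_bar - mu| <= tolerance) when x_bar is normal with SE sigma / sqrt(n)."""
    se = sigma / math.sqrt(n)
    z = tolerance / se
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)


for n in (30, 100):
    se = 80 / math.sqrt(n)
    p = prob_within(10, 80, n)
    print(f"n = {n:>3}: standard error = {se:.1f}, P(within 10 points) = {p:.4f}")
```

Adding other values to the loop, such as 60 or 200, shows how the probability keeps rising as the standard error shrinks.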

Knowledge Point 3 — Interpretation of Variability（变异性的解释）

Explanation（解释）
The blue curve (n = 30) shows greater spread, while the yellow curve (n = 100) is narrower, meaning less variability.
蓝色曲线（n=30）分布更宽，黄色曲线（n=100）更窄，表示样本均值的变异性降低。
Example（例子）
With $σ_{\bar{x}} = 14.6$ , $\bar{x}$ values are more dispersed; with $σ_{\bar{x}} = 8$ , values cluster near $μ = 1090$ .
当标准误差为14.6时，样本均值分布较散；当标准误差降至8时，样本均值更集中在1090附近。
Extension（拓展）
Reduced variability increases reliability of conclusions drawn from the sample.
较小的变异性提高了基于样本得出的结论的可靠性。
Summary（总结）
Smaller $σ_{\overset{x}{ˉ}}$ means less uncertainty and higher confidence in results.
标准误差越小，不确定性越低，对结果的置信度越高。

Slide 13 — Relationship Between Sample Size and Sampling Distribution

第13页——样本量与抽样分布的关系总结

Knowledge Points (知识点)

Sample size effect（样本量效应）
Variability of $\bar{x}$ （样本均值的变异性）
Practical implication（实际应用意义）

Knowledge Point 1 — Sample Size Effect（样本量效应）

Explanation（解释）
Selecting a larger sample (e.g., 100 applicants) yields a smaller standard error than a smaller one (e.g., 30 applicants).
较大的样本（如100人）相比于较小样本（如30人）会产生更小的标准误差。
Example（例子）
$σ_{\bar{x}} (n = 30) = 80 / \sqrt{30} \approx 14.6, \quad σ_{\bar{x}} (n = 100) = 80 / \sqrt{100} = 8.$
数值表明样本量越大，标准误差显著下降。
Extension（拓展）
With reduced standard error, $\bar{x}$ becomes a more stable and accurate estimator of $μ$ .
标准误差下降意味着样本均值对总体均值的估计更稳定、更准确。
Summary（总结）
Larger $n$ → Smaller $σ_{\bar{x}}$ → More reliable inference.
样本量越大 → 标准误差越小 → 推断越可靠。

Knowledge Point 2 — Variability of $\bar{x}$ （样本均值的变异性）

Explanation（解释）
Larger samples reduce variability in $\bar{x}$ and make it more likely to fall close to $μ$ .
较大的样本量能降低样本均值的变异，使其更接近总体均值。
Example（例子）
For $n = 100$ , $\bar{x}$ has less spread around $μ = 1090$ than for $n = 30$ .
样本量为100时， $\bar{x}$ 围绕1090的分布更集中。
Extension（拓展）
Reduced variability enhances the accuracy of population estimates and narrows confidence intervals.
变异性降低提高了总体估计的准确度，并使置信区间更窄。
Summary（总结）
Less variability implies greater consistency and stronger predictive power.
变异性越小，估计越一致，预测力越强。

Knowledge Point 3 — Practical Implication（实际应用意义）

Explanation（解释）
Researchers must balance accuracy and cost when determining sample size.
研究者在确定样本量时需平衡精度与成本。
Example（例子）
A business survey may prefer $n = 100$ for reliable results, though it costs more than $n = 30$ .
商业调查中，尽管样本量100成本更高，但能提供更可靠的结果。
Extension（拓展）
In practice, choose $n$ large enough to minimize error within resource constraints.
实务中应在资源允许范围内尽量增大样本量，以降低误差。
Summary（总结）
Larger samples yield better inference, but efficiency and feasibility must be considered.
样本越大推断越准，但需兼顾效率与可行性。
