Slide 1 — MGS 2150 Business Statistics
第1页——MGS 2150 商业统计学
Knowledge Points (知识点)
- Course title: Business Statistics (课程名称:商业统计学)
- Instructor and context: Prof. Rongjuan Chen, Fall 2025, Wenzhou-Kean University
- Focus: Chapter 3 Numerical Measures (第3章:数值度量)
Explanation (解释)
- Business Statistics applies statistical methods to solve real-world business and economic problems.
- 商业统计学将统计学方法应用到现实的商业与经济问题中。
Example (例子)
- A retail company uses statistics to analyze sales data and predict consumer demand.
- 一家零售公司利用统计数据分析销售趋势并预测消费者需求。
Extension (拓展)
- Business statistics is foundational for data-driven decision-making, investment analysis, and risk management.
- 商业统计学是数据驱动决策、投资分析和风险管理的重要基础。
Summary (总结)
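This slide introduces the course background and topic, and highlights the scope and importance of business statistics.
本页主要介绍了课程背景和主题,强调了商业统计学的应用范围与重要性。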
Slide 2 — Lecture 6 Overview
第2页——第六讲概览
Knowledge Points (知识点)
- Topic: Numerical measures (数值度量)
- Measures of location (位置度量)
- Measures of variability (离散程度度量)
Explanation (解释)
- Measures of location describe the central tendency of data.
- 位置度量用于描述数据的中心趋势。
- Measures of variability describe how data values spread around the center.
- 离散程度度量用于描述数据值在中心周围的分布范围。
Example (例子)
- A company analyzing employees’ salaries may look at the mean (平均数) for central tendency and the standard deviation (标准差) for variability.
- 一家公司在分析员工工资时,既要看平均工资(中心趋势),也要看标准差(离散程度)。
Extension (拓展)
- These two types of measures complement each other: location tells “where” the center of the data is, and variability tells “how spread out” the data are.
- 位置和离散程度度量相辅相成:位置告诉我们“中心在哪里”,离散程度告诉我们“数据分布得有多广”。
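The contrast between location and variability can be sketched in Python. The two salary lists below are hypothetical numbers made up for illustration (they are not from the slides): both share the same mean, but only one of them has any spread.

```python
import statistics

# Hypothetical salaries (in thousands) for two small teams; illustrative only.
team_a = [50, 50, 50, 50]
team_b = [20, 40, 60, 80]

# Location: where the center of each dataset is.
mean_a = statistics.mean(team_a)
mean_b = statistics.mean(team_b)

# Variability: how far values spread around that center
# (sample standard deviation, which divides by n - 1).
spread_a = statistics.stdev(team_a)
spread_b = statistics.stdev(team_b)

print(mean_a, mean_b)      # the two centers are identical
print(spread_a, spread_b)  # team_b is far more spread out than team_a
```

Reporting the mean alone would make the two teams look the same, which is why the two kinds of measure are usually read together.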
Summary (总结)
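This slide lays out the framework for the chapter and identifies its two core directions: location and variability.
本页为整章内容的框架,指出了两个核心方向:位置和离散程度。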
Slide 3 — Sample vs. Population
第3页——样本与总体
Knowledge Points (知识点)
- Sample (样本) vs. Population (总体)
- Sample statistics (样本统计量)
- Population parameters (总体参数)
- Point estimation (点估计)
Explanation (解释)
- Sample: A subset of population data. 样本是总体的一部分数据。
- Population: Entire set of interest. 总体是研究关注的全部对象。
- Sample statistics: Measures calculated from a sample (e.g., sample mean).
- 样本统计量:从样本中计算得出的度量(如样本均值)。
- Population parameters: Measures describing population (e.g., population mean).
- 总体参数:描述总体特征的度量(如总体均值)。
- Point estimate: A sample statistic used to estimate the corresponding population parameter.
- 点估计:样本统计量常用于估计总体参数。
Example (例子)
- A company surveys 200 employees (sample) to estimate average salary for all 5,000 employees (population).
- 一家公司调查200名员工(样本),用以估计全体5000名员工(总体)的平均工资。
Extension (拓展)
- Sampling saves time and cost, but introduces sampling error.
- 抽样节省时间和成本,但会带来抽样误差。
- Larger and random samples generally provide more accurate estimates.
- 样本量越大且越随机,估计越准确。
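The idea of using a sample statistic as a point estimate can be sketched with a small simulation. Everything numeric here is an assumption for illustration: the population of 5,000 salaries is randomly generated (centered on 60,000 with a standard deviation of 15,000), not real company data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: salaries for all 5,000 employees.
population = [random.gauss(60_000, 15_000) for _ in range(5_000)]

# Simple random sample of 200 employees, drawn without replacement.
sample = random.sample(population, 200)

mu = statistics.mean(population)  # population parameter
x_bar = statistics.mean(sample)   # sample statistic, the point estimate of mu
sampling_error = x_bar - mu       # gap between the estimate and the true value

print(f"population mean: {mu:.2f}")
print(f"sample mean:     {x_bar:.2f}")
print(f"sampling error:  {sampling_error:.2f}")
```

Changing the sample size from 200 to a larger number and rerunning with several seeds is one way to see the sampling error shrink on average.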
Summary (总结)
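This slide explains the difference between a sample and a population, and emphasizes the role of sample statistics in estimating population parameters.
本页介绍了样本与总体的区别,并强调了样本统计量在估计总体参数中的作用。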
Slide 4 — Mean (均值)
第4页——均值
Knowledge Points (知识点)
- Mean (均值) is a common measure of location.
- Sample mean (样本均值) vs. Population mean (总体均值).
- Formula difference.
Explanation (解释)
- Mean represents the average value of all data points.
- 均值表示所有数据点的平均水平。
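- Sample mean formula: $\bar{x} = \frac{\sum x_i}{n}$, where n is the number of observations in the sample.
- Population mean formula: $\mu = \frac{\sum x_i}{N}$, where N is the number of observations in the population.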
Example (例子)
- In a dataset of student test scores: 80, 85, 90. Mean = (80+85+90)/3 = 85.
- 学生成绩数据:80, 85, 90。均值 = (80+85+90)/3 = 85。
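As a minimal Python sketch, the test-score example is just "sum the observations, divide by the count". The variable names are illustrative.

```python
scores = [80, 85, 90]

# Sample mean: add up the observations and divide by how many there are.
n = len(scores)
x_bar = sum(scores) / n

print(x_bar)  # 85.0
```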
Extension (拓展)
- Mean is sensitive to outliers (极端值). For skewed data, median may be a better measure.
- 均值容易受到极端值影响,在偏态数据中,中位数可能更合适。
Summary (总结)
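This slide covers the concept, formulas, and use of the mean, and emphasizes the difference between the sample mean and the population mean.
本页讲解了均值的概念、公式和应用,强调了样本均值和总体均值的区别。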
Slide 5 — Example: Apartment Rents (均值案例:公寓租金)
第5页——均值案例:公寓租金
Knowledge Points (知识点)
- Data: 70 randomly sampled efficiency apartments in a college town.
- Calculation: Sample mean = 490.80.
- Tool: Excel function =AVERAGE(B2:B71)
Explanation (解释)
- A real dataset is used to illustrate the computation of the mean.
- 用实际公寓租金数据说明如何计算均值。
- The mean rent provides the “typical” rent value in the sample.
- 平均租金代表样本中的“典型”水平。
Example (例子)
- Apartment rent dataset total = 34,356. Divide by 70 → 490.80.
- 公寓租金总额 = 34,356,除以70 = 490.80。
Extension (拓展)
- This result can be used by property managers to compare with market average, set pricing strategies, or identify deviations.
- 房地产管理者可用此均值与市场平均对比,制定价格策略或发现异常情况。
- Also, policy makers may use such data to assess student housing affordability.
- 政策制定者也能利用此数据评估学生住房负担能力。
Image/Table Analysis (图片/表格分析)
- The dataset lists 70 rent values ranging from 425 to 615.
- 数据表列出了70个租金值,范围从425到615。
- The Excel output shows: Mean = 490.80.
- Excel 计算结果为490.80。
- Implication: Although individual values vary, the mean of about 491 summarizes the central location of the 70 rents in a single number.
- 含义:虽然个别值差异较大,但约为491的均值用一个数概括了这70个租金的中心位置。
Summary (总结)
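Using a concrete rent example, this slide demonstrates how the mean is computed and what it means in business practice.
本页通过具体租金案例,演示了均值的计算及其在实际商业中的意义。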
Slide 6 — Median (中位数)
第6页——中位数
Knowledge Points (知识点)
- Median is the middle value when data are ordered.
- Preferred for skewed data or when extreme values exist.
- Used in income, property value, executive salaries, etc.
Explanation (解释)
- Median is the central value dividing ordered data into two halves.
- 中位数是在排序后数据中将其分为两半的中心值。
- Unlike the mean, the median is largely unaffected by extreme values.
- 与均值不同,中位数基本不受极端值影响。
Example (例子)
- Data: 10, 12, 18 → Median = 12.
- 数据:10, 12, 18 → 中位数 = 12。
- Data: 10, 12, 100 → Mean = 40.67, Median = 12 → median better reflects typical value.
- 数据:10, 12, 100 → 均值 = 40.67,中位数 = 12 → 中位数更能反映典型水平。
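The effect of a single extreme value on the two measures can be checked with Python's standard `statistics` module. This is a small sketch using the two datasets from the example.

```python
import statistics

typical = [10, 12, 18]
with_outlier = [10, 12, 100]  # same data, but the largest value is extreme

for data in (typical, with_outlier):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # The mean moves a lot when 18 becomes 100; the median stays at 12.
    print(data, round(mean, 2), median)
```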
Extension (拓展)
- Median is widely used in official statistics such as median household income, because it avoids distortion by outliers.
- 中位数广泛用于官方统计,例如家庭收入中位数,因为它避免了极端值的扭曲。
Summary (总结)
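This slide introduces the definition of the median and when to use it, emphasizing its importance for skewed distributions.
本页介绍了中位数的定义和适用场景,强调了其在偏态分布下的重要性。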
Slide 7 — Median (Odd Number of Observations)
第7页——中位数(奇数个观测值)
Knowledge Points (知识点)
- For odd sample size, median is the middle value.
- Order data before calculation.
Explanation (解释)
- If number of observations is odd, median is at position (n+1)/2.
- 如果数据个数为奇数,中位数是位于第 (n+1)/2 个的数据值。
Example (例子)
- Data: 12, 14, 18, 19, 26, 27, 27 (n=7). Median = 4th value = 19.
- 数据:12, 14, 18, 19, 26, 27, 27 (n=7)。中位数 = 第4个值 = 19。
Extension (拓展)
- Odd-number case is straightforward; no averaging is needed.
- 奇数个观测值时,中位数直接取中间值,无需取平均。
Summary (总结)
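This slide emphasizes that with an odd number of observations the median is relatively simple to compute.
本页强调奇数个观测值时,中位数的计算方式相对简单。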
Slide 8 — Median (Even Number of Observations)
第8页——中位数(偶数个观测值)
Knowledge Points (知识点)
- For even sample size, median is the average of the two middle values.
- Order data first.
Explanation (解释)
- If number of observations is even, median = (n/2-th value + (n/2+1)-th value)/2.
- 如果数据个数为偶数,中位数 = (第 n/2 个值 + 第 (n/2+1) 个值)/2。
Example (例子)
- Data: 12, 14, 18, 19, 26, 27, 27, 30 (n=8).
- Middle values = 4th and 5th = 19 and 26 → Median = (19+26)/2 = 22.5.
- 数据:12, 14, 18, 19, 26, 27, 27, 30 (n=8)。
- 中间值 = 第4个和第5个 = 19 和 26 → 中位数 = (19+26)/2 = 22.5。
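The positional rules for odd and even n can be sketched as one Python function. The name `median_by_position` is hypothetical, and the eight-value list is an assumption: it takes the seven odd-case values 12, 14, 18, 19, 26, 27, 27 and appends 30, which is consistent with the stated middle values 19 and 26.

```python
def median_by_position(values):
    """Median via the positional rule for odd and even n."""
    ordered = sorted(values)  # always order the data first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    if n % 2 == 1:
        # Odd n: take the value at 1-based position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the values at 1-based positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


odd_data = [12, 14, 18, 19, 26, 27, 27]
even_data = [12, 14, 18, 19, 26, 27, 27, 30]  # assumed eight-value version

print(median_by_position(odd_data))   # 19
print(median_by_position(even_data))  # 22.5
```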
Extension (拓展)
- Averaging the two middle values places the median exactly midway between them, so half of the observations lie at or below it and half at or above it.
- 对中间两个值取平均,使中位数恰好位于二者的正中间,因此一半观测值不超过它,另一半不低于它。
Summary (总结)
This slide explains how to compute the median with an even number of observations.
本页解释了偶数个观测值时,中位数的计算方法。
Slide 9 — Example: Apartment Rents Median
第9页——公寓租金案例:中位数
Knowledge Points (知识点)
- Data: 70 apartment rents ordered.
- Median calculated = 475.
- Tool: Excel function =MEDIAN(B2:B71)
Explanation (解释)
- After ordering the 70 rent values, the median rent is the average of the 35th and 36th values.
- 将70个租金数据排序,中位数取第35个和第36个的平均值。
Example (例子)
- Rent values around the middle are 475 and 475 → Median = 475.
- 中间位置的租金值为475和475 → 中位数 = 475。
Extension (拓展)
- Median reflects the typical student apartment rent without being skewed by extreme values (e.g., 615).
- 中位数反映了典型的学生公寓租金,不受极端值(如615)的影响。
- Useful for policymakers to judge affordability.
- 对政策制定者评估租金负担能力非常有用。
Image/Table Analysis (图片/表格分析)
- Ordered dataset ranges 425–615.
- 排序后数据范围为425–615。
- Median = 475 sits centrally, showing half rents ≤ 475, half ≥ 475.
- 中位数 = 475,意味着一半租金不超过475,一半不低于475。
Summary (总结)
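Using the rent example, this slide shows how the median is computed and interpreted; it is especially representative when the rent distribution is asymmetric.
本页通过租金案例展示了中位数的计算与意义,尤其在租金分布不对称时更具代表性。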
Slide 10 — Trimmed Mean (截尾均值)
第10页——截尾均值
Knowledge Points (知识点)
- Trimmed mean deals with extreme values.
- Removes the same percentage of the lowest and highest values before averaging.
- Example: 5% trimmed mean removes smallest and largest 5%.
Explanation (解释)
- Trimmed mean provides a balance between mean and median.
- 截尾均值是均值与中位数的折中方法。
- Formula: Sort data → Remove bottom p% and top p% → Compute mean of remaining.
- 公式:排序 → 去掉最小p%和最大p% → 计算剩余数据的均值。
Example (例子)
- Data: [10, 12, 14, 100]. Mean = 34, Median = 13, 25% trimmed mean = (12+14)/2 = 13.
- 数据:[10, 12, 14, 100]。均值=34,中位数=13,25%截尾均值=(12+14)/2=13。
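The sort, trim, and average procedure can be sketched in Python as follows. The function name `trimmed_mean` is hypothetical, and rounding the number of trimmed observations down when n × p/100 is not a whole number is an assumption of this sketch; other conventions exist.

```python
def trimmed_mean(values, percent):
    """Mean after dropping the lowest and highest `percent`% of the data."""
    if not 0 <= percent < 50:
        raise ValueError("percent must be at least 0 and below 50")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("trimmed mean of empty data is undefined")
    # Observations to drop at each end (assumption: round down).
    k = int(n * percent / 100)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


data = [10, 12, 14, 100]
print(sum(data) / len(data))   # ordinary mean: 34.0
print(trimmed_mean(data, 25))  # drops 10 and 100, leaving (12 + 14) / 2 = 13.0
```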
Extension (拓展)
- Often used in sports scoring (e.g., gymnastics, diving) to avoid bias by extreme judges.
- 常用于体育打分(如体操、跳水),避免极端裁判打分的偏差。
- Also applied in financial data when a few extreme outliers may distort results.
- 在金融数据分析中,也可避免极端值的干扰。
Summary (总结)
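This slide introduces the concept of the trimmed mean and emphasizes its importance when handling extreme values.
本页介绍了截尾均值的概念,强调其在处理极端值时的重要性。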
Slide 11 — Mode (众数)
第11页——众数
Knowledge Points (知识点)
- Mode = value(s) with the highest frequency in a dataset.
- Possible types: unimodal, bimodal, multimodal.
- Excel's MODE.SNGL returns a single mode (the first one it encounters) even when the data have several; MODE.MULT returns all of them.
Explanation (解释)
- Mode reflects the most common or typical value in a dataset.
- 众数表示在数据集中出现频率最高的数值。
- Multiple modes can exist (two = bimodal, more than two = multimodal).
- 数据集中可能有多个众数(两个 = 双峰分布,多个 = 多峰分布)。
Example (例子)
- Data: [2, 3, 3, 4, 5, 5]. → Two modes: 3 and 5.
- 数据:[2, 3, 3, 4, 5, 5] → 众数 = 3 和 5。
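A short Python sketch of the bimodal example, using the standard `statistics` module: `multimode` lists every mode, while `mode` returns only the first one it encounters (in Python 3.8 and later), which is loosely comparable to a single-mode spreadsheet function.

```python
import statistics

data = [2, 3, 3, 4, 5, 5]

all_modes = statistics.multimode(data)  # every value tied for the highest frequency
first_mode = statistics.mode(data)      # only the first mode encountered

print(all_modes)   # [3, 5]
print(first_mode)  # 3
```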
Extension (拓展)
- Mode is especially useful for categorical or discrete data (e.g., product size preference: S, M, L).
- 众数对分类数据或离散数据特别有用(如产品尺码偏好:S、M、L)。
- In continuous data, the mode may approximate peak of distribution.
- 对连续数据,众数常接近分布的峰值。
Summary (总结)
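This slide introduces the concept and types of the mode, and emphasizes its role in reflecting the most common value.
本页介绍了众数的概念与分类,强调了其在反映最常见数值时的作用。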
Slide 12 — Example: Apartment Rents Mode
第12页——公寓租金案例:众数
Knowledge Points (知识点)
- Dataset: 70 apartment rents.
- Mode = 450.
- Tool: Excel function =MODE.SNGL(B2:B71)
Explanation (解释)
- 450 appears most frequently in the dataset, making it the mode.
- 在数据中,450出现次数最多,因此为众数。
Example (例子)
- In the ordered rent list, the value 450 repeats more often than any other value.
- 在排序后的租金列表中,450重复出现的次数多于其他任何数值,说明其最为常见。
Extension (拓展)
- Landlords may set rent close to the mode to align with the most common market rate.
- 房东可将租金定在众数附近,以接近市场主流价。
- Students searching for apartments likely encounter 450 rent more often, making it a reference point.
- 学生找房时最常见的租金水平就是450,可作为市场参考。
Image/Table Analysis (图片/表格分析)
- Ordered data shows clustering around 450.
- 排序后的数据在450附近有明显集中。
- Visual inspection confirms 450’s dominance.
- 直观观察也能确认450的频率最高。
Summary (总结)
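Using the rent example, this slide demonstrates how the mode is computed and what it means in market analysis.
本页通过租金案例演示了众数的计算及其在市场分析中的意义。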
Slide 13 — Excel Computation: Mean, Median, Mode
第13页——Excel 计算:均值、中位数、众数
Knowledge Points (知识点)
- Excel functions:
- Mean: =AVERAGE(B2:B71)
- Median: =MEDIAN(B2:B71)
- Mode: =MODE.SNGL(B2:B71)
- Computed results:
- Mean = 490.80
- Median = 475.00
- Mode = 450.00
Explanation (解释)
- Excel provides built-in functions to quickly compute location measures.
- Excel 提供内置函数,可快速计算位置度量。
- This reduces manual errors and improves efficiency.
- 这样既减少手工误差,又提升效率。
Example (例子)
- In B2:B71 dataset, Excel outputs mean=490.80, median=475, mode=450.
- 在 B2:B71 数据集中,Excel 计算得出均值=490.80,中位数=475,众数=450。
Extension (拓展)
- Analysts often use Excel for preliminary exploration before advanced software like R or Python.
- 分析师通常先用Excel做初步分析,再转向R或Python等高级软件。
- Easy functions make it accessible for non-technical managers.
- 简便的函数让非技术型管理者也能轻松使用。
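For readers moving from Excel to Python, the three measures have rough counterparts in the standard `statistics` module. The rent list below is a made-up six-value illustration, not the 70 values from the slides, so its results differ from 490.80, 475, and 450.

```python
import statistics

# Hypothetical rents for illustration only; NOT the 70 values from the slides.
rents = [425, 450, 450, 475, 500, 615]

mean_rent = statistics.mean(rents)      # rough counterpart of =AVERAGE(...)
median_rent = statistics.median(rents)  # rough counterpart of =MEDIAN(...)
mode_rent = statistics.mode(rents)      # rough counterpart of =MODE.SNGL(...)

print(round(mean_rent, 2), median_rent, mode_rent)
```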
Image/Table Analysis (图片/表格分析)
- Table structure: column A = apartment ID, column B = rent values.
- 表格结构:A列 = 公寓编号,B列 = 租金值。
- Computation summary displayed in side cells: 490.80, 475.00, 450.00.
- 计算结果直接显示在单元格:490.80、475.00、450.00。
Summary (总结)
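This slide shows how to compute the mean, median, and mode quickly in Excel, along with the results.
本页展示了Excel中均值、中位数、众数的快捷计算方法及其应用结果。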
Slide 14 — Percentiles (百分位数)
第14页——百分位数
Knowledge Points (知识点)
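- The pth percentile is a value such that at least p% of the observations are less than or equal to it and at least (100 − p)% are greater than or equal to it.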
- 25th percentile (Q1), 50th percentile (median), 75th percentile (Q3).
- Shows how data are distributed across range.
Explanation (解释)
- Percentiles provide thresholds showing what percentage of data lies below a value.
- 百分位数表示有多少比例的数据在某一数值以下。
- Example: 25th percentile = at least 25% values ≤ this number.
- 例:第25百分位数 = 至少25%的数据值 ≤ 该数。
Example (例子)
- Student test scores: min=68, max=100. 25th=76, 50th=83, 75th=95.
- 学生成绩:最小=68,最大=100。25百分位=76,50百分位=83,75百分位=95。
Extension (拓展)
- Widely used in education (grading), healthcare (growth charts), finance (risk analysis).
- 百分位广泛应用于教育(打分)、医疗(成长曲线)、金融(风险分析)。
- Percentiles highlight inequality in distributions (e.g., income).
- 百分位能揭示分布的不平衡(如收入分配)。
Summary (总结)
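This slide explains the concept and applications of percentiles, emphasizing their role in analyzing distributions.
本页讲解了百分位数的概念和应用,强调其在分布分析中的作用。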
Slide 15 — Example: Apartment Rents 80th Percentile
第15页——公寓租金案例:第80百分位数
Knowledge Points (知识点)
- Formula: i = (p/100) * n.
- For p=80, n=70 → i = (80/100) × 70 = 56. Because i is an integer, average the 56th and 57th ordered values.
- 80th percentile = (535+549)/2 = 542.
Explanation (解释)
- Percentile position depends on dataset size.
- 百分位数位置取决于数据规模。
- For 70 observations, the 80th percentile falls between 56th and 57th values.
- 在70个观测值中,第80百分位数位于第56与第57个值之间。
Example (例子)
- Apartment rent dataset → 80th percentile = 542.
- 公寓租金数据 → 第80百分位数 = 542。
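The position rule used in these notes can be sketched in Python: compute i = (p/100) × n; if i is an integer, average the values at positions i and i + 1, otherwise round i up. The function name `percentile_positions` is hypothetical, and p is assumed to be a whole number. The values 535 and 549 are the 56th and 57th ordered rents quoted on this slide.

```python
def percentile_positions(p, n):
    """1-based position(s) whose values define the p-th percentile of n ordered values."""
    if not 0 < p < 100 or n < 1:
        raise ValueError("need 0 < p < 100 and n >= 1")
    scaled = p * n  # equals 100 * i; kept as an integer to avoid float rounding
    if scaled % 100 == 0:
        i = scaled // 100
        return (i, i + 1)        # i is an integer: average positions i and i + 1
    return (-(-scaled // 100),)  # i is not an integer: round up (ceiling)


print(percentile_positions(80, 70))  # (56, 57)
print(percentile_positions(75, 70))  # (53,)

# 80th percentile from the two values quoted on the slide.
value_56, value_57 = 535, 549
p80 = (value_56 + value_57) / 2
print(p80)  # 542.0
```

Spreadsheet and statistics packages often interpolate between positions instead, so their percentile results can differ slightly from this rule.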
Extension (拓展)
- Higher percentiles describe the upper end of the distribution, here the highest rents.
- 较高百分位数反映了顶端租金水平。
- Landlords may use 80th percentile to set premium apartment pricing.
- 房东可根据第80百分位数制定高端租金价格。
Image/Table Analysis (图片/表格分析)
- Ordered dataset: values near 535 and 549 mark the 80th percentile cutoff.
- 排序数据中,第80百分位数位于535与549之间。
- Implication: about 20% of apartments charge ≥ 542, which marks the premium market segment.
- 含义:约20%的公寓租金 ≥ 542,代表高端市场。
Summary (总结)
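Using the rent data, this slide shows the steps for computing a percentile and its practical meaning.
本页通过租金数据案例展示了百分位数的计算步骤与实际意义。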
Slide 16 — Quartiles (四分位数)
第16页——四分位数
Knowledge Points (知识点)
- Quartiles are specific percentiles dividing data into four equal parts.
- Q1 = 25th percentile, Q2 = 50th percentile (median), Q3 = 75th percentile.
- Used to summarize distribution and detect outliers.
Explanation (解释)
- Quartiles divide dataset into four sections, each containing 25% of observations.
- 四分位数将数据集分为四部分,每部分占总数的25%。
- Q1 marks lower quarter, Q2 is the median, Q3 marks upper quarter.
- Q1代表下四分位点,Q2是中位数,Q3代表上四分位点。
Example (例子)
- Student scores: [60, 70, 75, 80, 85, 90, 95, 100].
- Q1 = 72.5, Q2 = 82.5, Q3 = 92.5.
- 学生成绩:[60, 70, 75, 80, 85, 90, 95, 100]。
- Q1=72.5, Q2=82.5, Q3=92.5。
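The quartiles in this example can be reproduced with a short Python sketch of the percentile rule i = (p/100) × n (average positions i and i + 1 when i is an integer, otherwise round up). The function name `percentile` is hypothetical and p is assumed to be a whole number.

```python
def percentile(values, p):
    """p-th percentile using the position rule i = (p / 100) * n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0 or not 0 < p < 100:
        raise ValueError("need non-empty data and 0 < p < 100")
    scaled = p * n  # equals 100 * i; integer arithmetic avoids float rounding
    if scaled % 100 == 0:
        i = scaled // 100
        # i is an integer: average the i-th and (i + 1)-th ordered values.
        return (ordered[i - 1] + ordered[i]) / 2
    # i is not an integer: round up and take that ordered value.
    i = -(-scaled // 100)
    return ordered[i - 1]


scores = [60, 70, 75, 80, 85, 90, 95, 100]
q1, q2, q3 = (percentile(scores, p) for p in (25, 50, 75))
print(q1, q2, q3)  # 72.5 82.5 92.5
```

Library routines such as Python's `statistics.quantiles` use different interpolation conventions, so they may not return exactly these values.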
Extension (拓展)
- Quartiles are essential for constructing boxplots (箱线图), a common tool to visualize spread and detect outliers.
- 四分位数是绘制箱线图的基础,可用于可视化分布并检测离群值。
- Widely used in finance (stock returns), business analytics (sales distribution), and social sciences (income inequality).
- 广泛应用于金融(股票收益)、商业分析(销售分布)、社会科学(收入不平等)。
Summary (总结)
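This slide introduces the concept of quartiles and explains their relationship to percentiles and their practical value.
本页介绍了四分位数的概念,说明其与百分位数的关系及应用价值。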
Slide 17 — Example: Apartment Rents 75th Percentile (Q3)
第17页——公寓租金案例:第75百分位数(Q3)
Knowledge Points (知识点)
- Formula: i = (p/100) * n.
- For p=75, n=70 → i = (75/100) × 70 = 52.5. Because i is not an integer, round up to the 53rd value.
- Q3 = 525.
Explanation (解释)
- The 75th percentile divides bottom 75% and top 25% of data.
- 第75百分位数将数据分为下75%与上25%。
- In ordered apartment rent dataset, Q3 = 525.
- 在排序的公寓租金数据中,Q3 = 525。
Example (例子)
- If rents range 425–615, then Q3 = 525 shows that 75% of apartments rent ≤ 525, and 25% rent ≥ 525.
- 租金范围425–615,Q3=525 → 表示75%的公寓租金 ≤ 525,25%的公寓 ≥ 525。
Extension (拓展)
- Q3 is often used to determine the “upper threshold” for reasonable values.
- Q3常用来确定合理值的上限。
- Combined with Q1, interquartile range (IQR = Q3–Q1) measures variability.
- 与Q1结合,四分位距(IQR=Q3–Q1)可衡量数据离散程度。
- In housing markets, Q3 indicates higher-end rental segment.
- 在住房市场,Q3代表较高端租金水平。
Image/Table Analysis (图片/表格分析)
- Ordered dataset shows Q3 at the 53rd observation = 525.
- 排序数据中,第53个观测值 = 525。
- This indicates that about one-quarter of apartments are priced at or above 525, the “premium” end of the market.
- 表明有四分之一的公寓租金 ≥ 525,属于高端市场。
Summary (总结)
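Using the rent example, this slide shows how Q3 is computed and interpreted, highlighting its value in market segmentation analysis.
本页通过租金案例展示了Q3的计算方法和解释,突出其在市场细分分析中的价值。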
Slide 18 — Lecture 6 Summary
第18页——第六讲总结
Knowledge Points (知识点)
- Measures of location: Mean, Median, Mode, Trimmed Mean, Percentiles, Quartiles.
- Each measure has strengths and weaknesses, suited for different scenarios.
- Practical examples: Apartment rents dataset illustrates all concepts.
Explanation (解释)
- Location measures summarize the central tendency of data.
- 位置度量用于总结数据的中心趋势。
- Choice depends on data distribution:
- Mean → sensitive to outliers.
- Median → robust under skewed data.
- Mode → useful for categorical/discrete data.
- Trimmed mean → balances robustness and precision.
- Percentiles/Quartiles → provide deeper distribution insights.
- 不同度量适合不同场景:均值对极端值敏感;中位数适合偏态分布;众数适合分类/离散数据;截尾均值兼顾鲁棒性和精确度;百分位数和四分位数能揭示分布细节。
Example (例子)
- In apartment rent dataset:
- Mean = 490.80
- Median = 475
- Mode = 450
- 80th percentile = 542
- Q3 = 525
- 在公寓租金案例中:
- 均值=490.80
- 中位数=475
- 众数=450
- 第80百分位数=542
- Q3=525
Extension (拓展)
- Business decisions:
- Mean → set average rent expectations.
- Median → assess affordability for majority.
- Percentiles → identify luxury vs. economy market segments.
- 商业决策:
- 均值 → 制定平均租金水平预期
- 中位数 → 评估大多数租客的负担能力
- 百分位数 → 确定高端与低端市场细分
Summary (总结)
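This slide summarizes all of Lecture 6, emphasizing where each measure of location applies in real data analysis and what it means for decisions.
本页总结了Lecture 6的全部内容,强调了各种位置度量工具在实际数据分析中的应用场景与决策意义。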