Slide 1 — MGS 2150 Business Statistics

第1页——MGS 2150 商业统计学

Knowledge Points (知识点)

  1. Course title: Business Statistics (课程名称:商业统计学)
  2. Instructor and context: Prof. Rongjuan Chen, Fall 2025, Wenzhou-Kean University
  3. Focus: Chapter 3 Numerical Measures (第3章:数值度量)

Explanation (解释)

  • Business Statistics applies statistical methods to solve real-world business and economic problems.
  • 商业统计学将统计学方法应用到现实的商业与经济问题中。

Example (例子)

  • A retail company uses statistics to analyze sales data and predict consumer demand.
  • 一家零售公司利用统计数据分析销售趋势并预测消费者需求。

Extension (拓展)

  • Business statistics is foundational for data-driven decision-making, investment analysis, and risk management.
  • 商业统计学是数据驱动决策、投资分析和风险管理的重要基础。

Summary (总结)

This slide introduces the course background and topic, emphasizing the scope and importance of business statistics.


Slide 2 — Lecture 6 Overview

第2页——第六讲概览

Knowledge Points (知识点)

  1. Topic: Numerical measures (数值度量)
  2. Measures of location (位置度量)
  3. Measures of variability (离散程度度量)

Explanation (解释)

  • Measures of location describe the central tendency of data.
  • 位置度量用于描述数据的中心趋势。
  • Measures of variability describe how data values spread around the center.
  • 离散程度度量用于描述数据值在中心周围的分布范围。

Example (例子)

  • A company analyzing employees’ salaries may look at the mean (平均数) for central tendency and the standard deviation (标准差) for variability.
  • 一家公司在分析员工工资时,既要看平均工资(中心趋势),也要看标准差(离散程度)。

Extension (拓展)

  • These two types of measures complement each other: location tells “where” the center of the data is, variability tells “how spread out” the data are.
  • 位置和离散程度度量相辅相成:位置告诉我们“中心在哪里”,离散程度告诉我们“数据分布得有多广”。

Summary (总结)

This slide lays out the framework for the chapter and identifies its two core themes: location and variability.


Slide 3 — Sample vs. Population

第3页——样本与总体

Knowledge Points (知识点)

  1. Sample (样本) vs. Population (总体)
  2. Sample statistics (样本统计量)
  3. Population parameters (总体参数)
  4. Point estimation (点估计)

Explanation (解释)

  • Sample: A subset of population data. 样本是总体的一部分数据。
  • Population: Entire set of interest. 总体是研究关注的全部对象。
  • Sample statistics: Measures calculated from a sample (e.g., sample mean).
  • 样本统计量:从样本中计算得出的度量(如样本均值)。
  • Population parameters: Measures describing population (e.g., population mean).
  • 总体参数:描述总体特征的度量(如总体均值)。
  • Point estimate: Sample statistic is used to estimate population parameter.
  • 点估计:样本统计量常用于估计总体参数。

Example (例子)

  • A company surveys 200 employees (sample) to estimate average salary for all 5,000 employees (population).
  • 一家公司调查200名员工(样本),用以估计全体5000名员工(总体)的平均工资。

Extension (拓展)

  • Sampling saves time and cost, but introduces sampling error.
  • 抽样节省时间和成本,但会带来抽样误差。
  • Larger and random samples generally provide more accurate estimates.
  • 样本量越大且越随机,估计越准确。
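
The relationship between a population parameter, a sample statistic, and sampling error can be sketched in a few lines of Python. This is only an illustration: the 5,000 salaries below are randomly generated, hypothetical values (not data from the lecture), and the salary level and spread passed to `random.gauss` are arbitrary assumptions.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 5,000 made-up salaries (not lecture data).
population = [random.gauss(60_000, 12_000) for _ in range(5_000)]

# Population parameter: the mean of all 5,000 values.
mu = statistics.mean(population)

# Sample statistic: the mean of a simple random sample of 200 values.
sample = random.sample(population, 200)
x_bar = statistics.mean(sample)

# The sample mean is a point estimate of mu; the gap between them
# is the sampling error for this particular sample.
sampling_error = x_bar - mu

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (point estimate): {x_bar:.2f}")
print(f"sampling error: {sampling_error:.2f}")
```

Changing the seed or the sample size gives a different sample mean each time, which is the sampling error the notes refer to.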

Summary (总结)

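This slide explains the difference between a sample and a population and emphasizes the role of sample statistics in estimating population parameters.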


Slide 4 — Mean (均值)

第4页——均值

Knowledge Points (知识点)

  1. Mean (均值) is a common measure of location.
  2. Sample mean (样本均值) vs. Population mean (总体均值).
  3. Formula difference.

Explanation (解释)

  • Mean represents the average value of all data points.
  • 均值表示所有数据点的平均水平。
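  • Sample mean formula: $\bar{x} = \frac{\sum x_i}{n}$, where n is the number of observations in the sample.
  • Population mean formula: $\mu = \frac{\sum x_i}{N}$, where N is the number of observations in the population.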

Example (例子)

  • In a dataset of student test scores: 80, 85, 90. Mean = (80+85+90)/3 = 85.
  • 学生成绩数据:80, 85, 90。均值 = (80+85+90)/3 = 85。
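
The test-score example can be checked with a short Python sketch. It computes the mean twice, once directly from the definition (sum divided by count) and once with the standard-library `statistics.mean`; the variable names are illustrative only.

```python
import statistics

scores = [80, 85, 90]  # test scores from the example above

# Mean = sum of the values divided by the number of values.
mean_by_formula = sum(scores) / len(scores)

# The standard library gives the same result.
mean_by_library = statistics.mean(scores)

print(mean_by_formula, mean_by_library)
```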

Extension (拓展)

  • Mean is sensitive to outliers (极端值). For skewed data, median may be a better measure.
  • 均值容易受到极端值影响,在偏态数据中,中位数可能更合适。

Summary (总结)

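This slide covers the concept, formulas, and use of the mean, emphasizing the distinction between the sample mean and the population mean.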


Slide 5 — Example: Apartment Rents (均值案例:公寓租金)

第5页——均值案例:公寓租金

Knowledge Points (知识点)

  1. Data: 70 randomly sampled efficiency apartments in a college town.
  2. Calculation: Sample mean = 490.80.
  3. Tool: Excel formula =AVERAGE(B2:B71)

Explanation (解释)

  • Real dataset is used to illustrate computation of mean.
  • 用实际公寓租金数据说明如何计算均值。
  • The mean rent provides the “typical” rent value in the sample.
  • 平均租金代表样本中的“典型”水平。

Example (例子)

  • Apartment rent dataset total = 34,356. Divide by 70 → 490.80.
  • 公寓租金总额 = 34,356,除以70 = 490.80。
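
The arithmetic can be reproduced directly from the two figures given in the notes (the total of 34,356 and the sample size of 70). The individual rent values are not listed here, so this sketch starts from the total rather than from the raw data.

```python
total_rent = 34_356   # sum of the 70 sampled rents, as stated above
n_apartments = 70     # sample size

sample_mean = total_rent / n_apartments
print(f"{sample_mean:.2f}")  # prints 490.80
```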

Extension (拓展)

  • This result can be used by property managers to compare with market average, set pricing strategies, or identify deviations.
  • 房地产管理者可用此均值与市场平均对比,制定价格策略或发现异常情况。
  • Also, policy makers may use such data to assess student housing affordability.
  • 政策制定者也能利用此数据评估学生住房负担能力。

Image/Table Analysis (图片/表格分析)

  • The dataset lists 70 rent values ranging from 425 to 615.
  • 数据表列出了70个租金值,范围从425到615。
  • The Excel output shows: Mean = 490.80.
  • Excel 计算结果为490.80。
  • Implication: Although individual values vary, the average stabilizes near 491, giving a reliable measure of central location.
  • 含义:虽然个别值差异较大,但平均值稳定在491左右,体现了中心位置的代表性。

Summary (总结)

This slide uses the apartment rent example to demonstrate how the mean is computed and what it means in a business setting.


Slide 6 — Median (中位数)

第6页——中位数

Knowledge Points (知识点)

  1. Median is the middle value when data are ordered.
  2. Preferred for skewed data or when extreme values exist.
  3. Used in income, property value, executive salaries, etc.

Explanation (解释)

  • Median is the central value dividing ordered data into two halves.
  • 中位数是在排序后数据中将其分为两半的中心值。
  • Unlike mean, median is not affected by extreme values.
  • 与均值不同,中位数不受极端值影响。

Example (例子)

  • Data: 10, 12, 18 → Median = 12.
  • 数据:10, 12, 18 → 中位数 = 12。
  • Data: 10, 12, 100 → Mean = 40.67, Median = 12 → median better reflects typical value.
  • 数据:10, 12, 100 → 均值 = 40.67,中位数 = 12 → 中位数更能反映典型水平。
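
A quick way to see the effect of one extreme value is to compute both measures on the two small datasets above. This sketch uses Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

typical = [10, 12, 18]
with_outlier = [10, 12, 100]  # same data with the largest value replaced by 100

for data in (typical, with_outlier):
    mean = round(statistics.mean(data), 2)
    median = statistics.median(data)
    print(data, "mean =", mean, "median =", median)
```

The median is 12 in both cases, while the mean rises from about 13.33 to about 40.67 once the extreme value is introduced.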

Extension (拓展)

  • Median is widely used in official statistics such as median household income, because it avoids distortion by outliers.
  • 中位数广泛用于官方统计,例如家庭收入中位数,因为它避免了极端值的扭曲。

Summary (总结)

This slide introduces the definition of the median and when to use it, emphasizing its importance for skewed distributions.


Slide 7 — Median (Odd Number of Observations)

第7页——中位数(奇数个观测值)

Knowledge Points (知识点)

  1. For odd sample size, median is the middle value.
  2. Order data before calculation.

Explanation (解释)

  • If number of observations is odd, median is at position (n+1)/2.
  • 如果数据个数为奇数,中位数是位于第 (n+1)/2 个的数据值。

Example (例子)

  • Data: 12, 14, 18, 19, 26, 27, 27 (n=7). Median = 4th value = 19.
  • 数据:12, 14, 18, 19, 26, 27, 27 (n=7)。中位数 = 第4个值 = 19。

Extension (拓展)

  • Odd-number case is straightforward; no averaging is needed.
  • 奇数个观测值时,中位数直接取中间值,无需取平均。

Summary (总结)

This slide emphasizes that with an odd number of observations the median is relatively simple to compute.


Slide 8 — Median (Even Number of Observations)

第8页——中位数(偶数个观测值)

Knowledge Points (知识点)

  1. For even sample size, median is the average of the two middle values.
  2. Order data first.

Explanation (解释)

  • If number of observations is even, median = (n/2-th value + (n/2+1)-th value)/2.
  • 如果数据个数为偶数,中位数 = (第 n/2 个值 + 第 (n/2+1) 个值)/2。

Example (例子)

  • Data: 12, 14, 18, 19, 26, 27, 27, 30 (n = 8; the seven values from the previous slide plus 30).
  • Middle values = 19 and 26 → Median = (19+26)/2 = 22.5.
  • 数据:12, 14, 18, 19, 26, 27, 27, 30 (n = 8;上一页的七个值再加上30)。
  • 中间值 = 19 和 26 → 中位数 = (19+26)/2 = 22.5。
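
The odd and even position rules can be combined into one small function. The sketch below is a minimal illustration, not a replacement for `statistics.median`; the function name `median_by_position` is hypothetical, and the eight-value list assumes the seven values from the odd-sized example with 30 appended.

```python
def median_by_position(values):
    """Median using the position rules: (n+1)/2 for odd n,
    average of positions n/2 and n/2 + 1 for even n."""
    ordered = sorted(values)          # always order the data first
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: take the value at 1-based position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the values at 1-based positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

odd_data = [12, 14, 18, 19, 26, 27, 27]        # n = 7
even_data = [12, 14, 18, 19, 26, 27, 27, 30]   # n = 8 (assumed: odd_data plus 30)

print(median_by_position(odd_data))   # 19
print(median_by_position(even_data))  # 22.5
```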

Extension (拓展)

  • Averaging the two middle values yields a single number with half of the observations on each side, so the median stays at the center of the ordered data.
  • 取两个中间值的平均可得到一个数值,使其两侧各有一半的观测值,从而中位数仍处于排序数据的中心。

Summary (总结)

This slide explains how to compute the median when the number of observations is even.


Slide 9 — Example: Apartment Rents Median

第9页——公寓租金案例:中位数

Knowledge Points (知识点)

  1. Data: 70 apartment rents ordered.
  2. Median calculated = 475.
  3. Tool: Excel formula =MEDIAN(B2:B71).

Explanation (解释)

  • With the 70 rent values in order, the median is the average of the 35th and 36th values.
  • 将70个租金数据排序,中位数取第35个和第36个的平均值。

Example (例子)

  • Rent values around the middle are 475 and 475 → Median = 475.
  • 中间位置的租金值为475和475 → 中位数 = 475。

Extension (拓展)

  • Median reflects the typical student apartment rent without being skewed by extreme values (e.g., 615).
  • 中位数反映了典型的学生公寓租金,不受极端值(如615)的影响。
  • Useful for policymakers to judge affordability.
  • 对政策制定者评估租金负担能力非常有用。

Image/Table Analysis (图片/表格分析)

  • Ordered dataset ranges 425–615.
  • 排序后数据范围为425–615。
  • Median = 475 sits centrally, showing half rents ≤ 475, half ≥ 475.
  • 中位数 = 475,意味着一半租金不超过475,一半不低于475。

Summary (总结)

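This slide uses the rent example to show how the median is computed and interpreted; it is especially representative when the rent distribution is asymmetric.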


Slide 10 — Trimmed Mean (截尾均值)

第10页——截尾均值

Knowledge Points (知识点)

  1. Trimmed mean deals with extreme values.
  2. Removes a given percentage of the lowest and highest values before averaging.
  3. Example: 5% trimmed mean removes smallest and largest 5%.

Explanation (解释)

  • Trimmed mean provides a balance between mean and median.
  • 截尾均值是均值与中位数的折中方法。
  • Formula: Sort data → Remove bottom p% and top p% → Compute mean of remaining.
  • 公式:排序 → 去掉最小p%和最大p% → 计算剩余数据的均值。

Example (例子)

  • Data: [10, 12, 14, 100]. Mean = 34, Median = 13, 25% trimmed mean = (12+14)/2 = 13.
  • 数据:[10, 12, 14, 100]。均值=34,中位数=13,25%截尾均值=(12+14)/2=13。
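
The sort, trim, and average steps can be sketched as a short function. The name `trimmed_mean` is hypothetical, and one detail is an assumption not covered in the notes: when the percentage does not correspond to a whole number of observations, this sketch rounds the number trimmed from each end down.

```python
import math

def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from each end."""
    ordered = sorted(values)
    # Number of observations to drop from EACH end.
    # Assumption: round down when the count is not a whole number.
    k = math.floor(len(ordered) * percent / 100)
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("percent is too large: no observations remain")
    return sum(kept) / len(kept)

data = [10, 12, 14, 100]
print(trimmed_mean(data, 0))    # 34.0, the ordinary mean
print(trimmed_mean(data, 25))   # 13.0, mean of 12 and 14
```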

Extension (拓展)

  • Often used in sports scoring (e.g., gymnastics, diving) to avoid bias by extreme judges.
  • 常用于体育打分(如体操、跳水),避免极端裁判打分的偏差。
  • Also applied in financial data when a few extreme outliers may distort results.
  • 在金融数据分析中,也可避免极端值的干扰。

Summary (总结)

This slide introduces the trimmed mean and emphasizes its importance when handling extreme values.


Slide 11 — Mode (众数)

第11页——众数

Knowledge Points (知识点)

  1. Mode = value(s) with the highest frequency in a dataset.
  2. Possible types: unimodal, bimodal, multimodal.
  3. Excel returns the first occurring mode only.

Explanation (解释)

  • Mode reflects the most common or typical value in a dataset.
  • 众数表示在数据集中出现频率最高的数值。
  • Multiple modes can exist (two = bimodal, more than two = multimodal).
  • 数据集中可能有多个众数(两个 = 双峰分布,多个 = 多峰分布)。

Example (例子)

  • Data: [2, 3, 3, 4, 5, 5]. → Two modes: 3 and 5.
  • 数据:[2, 3, 3, 4, 5, 5] → 众数 = 3 和 5。
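
The bimodal example can be reproduced by counting frequencies. The sketch below first does the count by hand with `collections.Counter`, then shows the standard-library helpers `statistics.multimode` and `statistics.mode` (both as they behave in Python 3.8 and later, where `mode` returns the first mode encountered, similar in spirit to the single-mode Excel behavior noted above).

```python
from collections import Counter
import statistics

data = [2, 3, 3, 4, 5, 5]

# Count how often each value occurs, then keep every value
# that reaches the highest frequency.
counts = Counter(data)
highest = max(counts.values())
modes = sorted(value for value, count in counts.items() if count == highest)
print(modes)                        # [3, 5]

# Standard-library equivalents (Python 3.8+).
print(statistics.multimode(data))   # all modes
print(statistics.mode(data))        # only the first mode encountered
```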

Extension (拓展)

  • Mode is especially useful for categorical or discrete data (e.g., product size preference: S, M, L).
  • 众数对分类数据或离散数据特别有用(如产品尺码偏好:S、M、L)。
  • In continuous data, the mode may approximate peak of distribution.
  • 对连续数据,众数常接近分布的峰值。

Summary (总结)

This slide introduces the concept and types of mode, emphasizing its role in identifying the most common value.


Slide 12 — Example: Apartment Rents Mode

第12页——公寓租金案例:众数

Knowledge Points (知识点)

  1. Dataset: 70 apartment rents.
  2. Mode = 450.
  3. Tool: Excel formula =MODE.SNGL(B2:B71).

Explanation (解释)

  • 450 appears most frequently in the dataset, making it the mode.
  • 在数据中,450出现次数最多,因此为众数。

Example (例子)

  • In the ordered rent list, the 450 values sit together in one run, and 450 occurs more often than any other rent.
  • 在排序后的租金列表中,450 连续出现,且出现次数多于其他任何租金值。

Extension (拓展)

  • Landlords may set rent close to the mode to align with the most common market rate.
  • 房东可将租金定在众数附近,以接近市场主流价。
  • Students searching for apartments likely encounter 450 rent more often, making it a reference point.
  • 学生找房时最常见的租金水平就是450,可作为市场参考。

Image/Table Analysis (图片/表格分析)

  • Ordered data shows clustering around 450.
  • 排序后的数据在450附近有明显集中。
  • Visual inspection confirms 450’s dominance.
  • 直观观察也能确认450的频率最高。

Summary (总结)

This slide uses the rent example to demonstrate how the mode is found and what it means for market analysis.


Slide 13 — Excel Computation: Mean, Median, Mode

第13页——Excel 计算:均值、中位数、众数

Knowledge Points (知识点)

  1. Excel functions:
    • Mean: =AVERAGE(B2:B71)
    • Median: =MEDIAN(B2:B71)
    • Mode: =MODE.SNGL(B2:B71)
  2. Computed results:
    • Mean = 490.80
    • Median = 475.00
    • Mode = 450.00

Explanation (解释)

  • Excel provides built-in functions to quickly compute location measures.
  • Excel 提供内置函数,可快速计算位置度量。
  • This reduces manual errors and improves efficiency.
  • 这样既减少手工误差,又提升效率。

Example (例子)

  • In B2:B71 dataset, Excel outputs mean=490.80, median=475, mode=450.
  • 在 B2:B71 数据集中,Excel 计算得出均值=490.80,中位数=475,众数=450。

Extension (拓展)

  • Analysts often use Excel for preliminary exploration before advanced software like R or Python.
  • 分析师通常先用Excel做初步分析,再转向R或Python等高级软件。
  • Easy functions make it accessible for non-technical managers.
  • 简便的函数让非技术型管理者也能轻松使用。

Image/Table Analysis (图片/表格分析)

  • Table structure: column A = apartment ID, column B = rent values.
  • 表格结构:A列 = 公寓编号,B列 = 租金值。
  • Computation summary displayed in side cells: 490.80, 475.00, 450.00.
  • 计算结果直接显示在单元格:490.80、475.00、450.00。

Summary (总结)

This slide shows the Excel shortcuts for computing the mean, median, and mode, along with the results.


Slide 14 — Percentiles (百分位数)

第14页——百分位数

Knowledge Points (知识点)

  1. Percentile = value dividing dataset into 100 equal parts.
  2. 25th percentile (Q1), 50th percentile (median), 75th percentile (Q3).
  3. Shows how data are distributed across range.

Explanation (解释)

  • Percentiles provide thresholds showing what percentage of data lies below a value.
  • 百分位数表示有多少比例的数据在某一数值以下。
  • Example: 25th percentile = at least 25% values ≤ this number.
  • 例:第25百分位数 = 至少25%的数据值 ≤ 该数。

Example (例子)

  • Student test scores: min=68, max=100. 25th=76, 50th=83, 75th=95.
  • 学生成绩:最小=68,最大=100。25百分位=76,50百分位=83,75百分位=95。

Extension (拓展)

  • Widely used in education (grading), healthcare (growth charts), finance (risk analysis).
  • 百分位广泛应用于教育(打分)、医疗(成长曲线)、金融(风险分析)。
  • Percentiles highlight inequality in distributions (e.g., income).
  • 百分位能揭示分布的不平衡(如收入分配)。

Summary (总结)

This slide explains the concept and applications of percentiles, emphasizing their role in analyzing distributions.


Slide 15 — Example: Apartment Rents 80th Percentile

第15页——公寓租金案例:第80百分位数

Knowledge Points (知识点)

  1. Formula: i = (p/100) * n.
  2. For p=80, n=70 → i = 56.
  3. 80th percentile = average of 56th and 57th values = (535+549)/2 = 542.

Explanation (解释)

  • Percentile position depends on dataset size.
  • 百分位数位置取决于数据规模。
  • For 70 observations, the 80th percentile falls between 56th and 57th values.
  • 在70个观测值中,第80百分位数位于第56与第57个值之间。

Example (例子)

  • Apartment rent dataset → 80th percentile = 542.
  • 公寓租金数据 → 第80百分位数 = 542。
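
The index rule behind this result can be sketched as follows: compute i = (p/100) * n; if i is a whole number, average the values at positions i and i+1; otherwise round i up and take that single position. The function name `percentile_positions` is hypothetical, and the sketch assumes p is a whole-number percentage between 1 and 99. Because the 70 rents are not listed in these notes, only the positions are computed, and the two values at those positions (535 and 549) are taken from the slide.

```python
def percentile_positions(p, n):
    """1-based positions in the ordered data that determine the p-th percentile."""
    # Integer arithmetic avoids floating-point surprises: i = p * n / 100.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # i is an integer: average the values at positions i and i + 1.
        return (whole, whole + 1)
    # i is not an integer: round up to the next position.
    return (whole + 1,)

print(percentile_positions(80, 70))   # (56, 57)
print(percentile_positions(75, 70))   # (53,)

# 56th and 57th ordered rents, as given on the slide.
percentile_80 = (535 + 549) / 2
print(percentile_80)                  # 542.0
```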

Extension (拓展)

  • Higher percentiles describe the upper end of a distribution, here the highest rents.
  • 较高百分位数反映了顶端租金水平。
  • Landlords may use 80th percentile to set premium apartment pricing.
  • 房东可根据第80百分位数制定高端租金价格。

Image/Table Analysis (图片/表格分析)

  • Ordered dataset: values near 535 and 549 mark the 80th percentile cutoff.
  • 排序数据中,第80百分位数位于535与549之间。
  • Implication: 20% of apartments charge ≥ 542, showing premium market segment.
  • 含义:20%的公寓租金 ≥ 542,代表高端市场。

Summary (总结)

This slide uses the rent data to show the steps for computing a percentile and its practical meaning.


Slide 16 — Quartiles (四分位数)

第16页——四分位数

Knowledge Points (知识点)

  1. Quartiles are specific percentiles dividing data into four equal parts.
  2. Q1 = 25th percentile, Q2 = 50th percentile (median), Q3 = 75th percentile.
  3. Used to summarize distribution and detect outliers.

Explanation (解释)

  • Quartiles divide dataset into four sections, each containing 25% of observations.
  • 四分位数将数据集分为四部分,每部分占总数的25%。
  • Q1 marks lower quarter, Q2 is the median, Q3 marks upper quarter.
  • Q1代表下四分位点,Q2是中位数,Q3代表上四分位点。

Example (例子)

  • Student scores: [60, 70, 75, 80, 85, 90, 95, 100].
  • Q1 = 72.5, Q2 = 82.5, Q3 = 92.5.
  • 学生成绩:[60, 70, 75, 80, 85, 90, 95, 100]。
  • Q1=72.5, Q2=82.5, Q3=92.5。
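
The quartiles above follow from the index rule i = (p/100) * n used in this lecture (average positions i and i+1 when i is a whole number, otherwise round up). A minimal sketch of that rule is shown below; the function name `percentile` is hypothetical and p is assumed to be a whole number between 1 and 99. Library routines such as `statistics.quantiles`, and spreadsheet quartile functions, typically interpolate instead and may return slightly different values.

```python
def percentile(values, p):
    """p-th percentile using the lecture's index rule i = (p / 100) * n."""
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # i is an integer: average the values at 1-based positions i and i + 1.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not an integer: round up, i.e. 1-based position whole + 1.
    return ordered[whole]

scores = [60, 70, 75, 80, 85, 90, 95, 100]

q1 = percentile(scores, 25)
q2 = percentile(scores, 50)
q3 = percentile(scores, 75)
print(q1, q2, q3)   # 72.5 82.5 92.5
```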

Extension (拓展)

  • Quartiles are essential for constructing boxplots (箱线图), a common tool to visualize spread and detect outliers.
  • 四分位数是绘制箱线图的基础,可用于可视化分布并检测离群值。
  • Widely used in finance (stock returns), business analytics (sales distribution), and social sciences (income inequality).
  • 广泛应用于金融(股票收益)、商业分析(销售分布)、社会科学(收入不平等)。

Summary (总结)

This slide introduces quartiles and explains their relationship to percentiles and their practical value.


Slide 17 — Example: Apartment Rents 75th Percentile (Q3)

第17页——公寓租金案例:第75百分位数(Q3)

Knowledge Points (知识点)

  1. Formula: i = (p/100) * n.
  2. For p=75, n=70 → i = 52.5, which is not an integer → round up to the 53rd value.
  3. Q3 = 525.

Explanation (解释)

  • The 75th percentile divides bottom 75% and top 25% of data.
  • 第75百分位数将数据分为下75%与上25%。
  • In ordered apartment rent dataset, Q3 = 525.
  • 在排序的公寓租金数据中,Q3 = 525。

Example (例子)

  • If rents range 425–615, then Q3 = 525 shows that 75% of apartments rent ≤ 525, and 25% rent ≥ 525.
  • 租金范围425–615,Q3=525 → 表示75%的公寓租金 ≤ 525,25%的公寓 ≥ 525。

Extension (拓展)

  • Q3 is often used to determine the “upper threshold” for reasonable values.
  • Q3常用来确定合理值的上限。
  • Combined with Q1, interquartile range (IQR = Q3–Q1) measures variability.
  • 与Q1结合,四分位距(IQR=Q3–Q1)可衡量数据离散程度。
  • In housing markets, Q3 indicates higher-end rental segment.
  • 在住房市场,Q3代表较高端租金水平。
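
As a small illustration of IQR = Q3 – Q1, the sketch below reuses the student-score list from the quartiles slide, since Q1 for the rent data is not given in these notes. The helper `percentile` is a hypothetical function implementing the index rule i = (p/100) * n for whole-number p; it is an assumption-laden sketch rather than a standard library routine.

```python
def percentile(values, p):
    """p-th percentile via the index rule i = (p / 100) * n."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        # Integer index: average positions i and i + 1 (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Non-integer index: round up to the next position.
    return ordered[whole]

scores = [60, 70, 75, 80, 85, 90, 95, 100]  # from the quartiles example

q1 = percentile(scores, 25)
q3 = percentile(scores, 75)
iqr = q3 - q1   # spread of the middle 50% of the data
print(q1, q3, iqr)   # 72.5 92.5 20.0
```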

Image/Table Analysis (图片/表格分析)

  • Ordered dataset shows Q3 at the 53rd observation = 525.
  • 排序数据中,第53个观测值 = 525。
  • This indicates that one-quarter of apartments are priced above 525, showing the “premium” market.
  • 表明有四分之一的公寓租金 ≥ 525,属于高端市场。

Summary (总结)

This slide uses the rent example to show how Q3 is computed and interpreted, highlighting its value in market segmentation analysis.


Slide 18 — Lecture 6 Summary

第18页——第六讲总结

Knowledge Points (知识点)

  1. Measures of location: Mean, Median, Mode, Trimmed Mean, Percentiles, Quartiles.
  2. Each measure has strengths and weaknesses, suited for different scenarios.
  3. Practical examples: Apartment rents dataset illustrates all concepts.

Explanation (解释)

  • Location measures summarize the central tendency of data.
  • 位置度量用于总结数据的中心趋势。
  • Choice depends on data distribution:
    • Mean → sensitive to outliers.
    • Median → robust under skewed data.
    • Mode → useful for categorical/discrete data.
    • Trimmed mean → balances robustness and precision.
    • Percentiles/Quartiles → provide deeper distribution insights.
  • 不同度量适合不同场景:均值对极端值敏感;中位数适合偏态分布;众数适合分类/离散数据;截尾均值兼顾鲁棒性和精确度;百分位数和四分位数能揭示分布细节。

Example (例子)

  • In apartment rent dataset:
    • Mean = 490.80
    • Median = 475
    • Mode = 450
    • 80th percentile = 542
    • Q3 = 525
  • 在公寓租金案例中:
    • 均值=490.80
    • 中位数=475
    • 众数=450
    • 第80百分位数=542
    • Q3=525

Extension (拓展)

  • Business decisions:
    • Mean → set average rent expectations.
    • Median → assess affordability for majority.
    • Percentiles → identify luxury vs. economy market segments.
  • 商业决策:
    • 均值 → 制定平均租金水平预期
    • 中位数 → 评估大多数租客的负担能力
    • 百分位数 → 确定高端与低端市场细分

Summary (总结)

This slide summarizes all of Lecture 6, emphasizing where each measure of location applies in real data analysis and what it means for decisions.